kimtth/visual-genius

Visual Genius: Communication Assistant

Most children with autism spectrum disorders (ASD) are visual learners. They tend to comprehend visual information better than auditory input, making visual supports more effective for their learning process.

The project started because creating visual cards by hand is laborious and because the market lacks a good, cost-effective product. It also aims to democratize AI, making it accessible to everyone, and to use new technologies to overcome the limitations of earlier products.

  • Visual aid products on the market: Bing Search Results. They can be quite costly: Amazon Search Result
  • Applied behavior analysis (ABA) is a therapeutic approach for treating ASD.
  • Visual aids in applied behavior analysis (ABA) URL

Key Features

  1. Switching between personas and modes of generation (List, Steps, Manual / Parents and Caregivers, Children)
  2. Visual Card generation and management (Set the order of images by Drag and Drop)
  3. Semantic Image search
  4. Video generation from images (To teach work procedures)
  5. Text-to-Image

Application Preview

git_preview.mp4
  1. Vector-based image search: Azure Cognitive Search & Computer Vision API for Vector embedding
  2. Text-to-image generation: Azure OpenAI GPT-3.5 & image generation by Azure OpenAI DALL-E

    Because image generation is slow, only the last image is generated by DALL-E.

  3. Bing Image Search
  4. Fluent emoji dataset
  5. Azure Cognitive Services Text to Speech (reads the text on the card aloud)
  6. [Optional] Microsoft Coco dataset (Everyday Life Images)

    The test dataset for Semantic Image Search. Semantic search seeks images based on their features, not by the associated metadata tags or the image file name.
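To illustrate the idea behind vector-based image search, here is a minimal sketch in plain Python. It is not the project's code: in the application, Azure Cognitive Search does the ranking and the embeddings come from the Computer Vision API. The function names (`cosine_similarity`, `rank_images`), the file names, and the toy three-dimensional vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_images(query_vec, image_vecs, top_k=3):
    """Rank images by similarity of their embedding to the query embedding."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in image_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (assumption): real embeddings have hundreds of dimensions.
images = {
    "dog_park.jpg": [0.9, 0.1, 0.0],
    "toothbrush.jpg": [0.0, 0.2, 0.9],
    "puppy.jpg": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stands in for the embedding of a text query

for name, score in rank_images(query, images, top_k=2):
    print(f"{name}: {score:.3f}")
```

The file names play no part in the ranking: only the vectors are compared, which is what distinguishes this from searching by metadata tags or file names.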

Configure Development environment

Note: Please ensure you have installed Node.js and Python 3.

To preview and run the project on your device:

  1. Open project folder in Visual Studio Code
  2. In the terminal, run npm install
  3. Run npm run dev to view the project in a browser
  4. Run python app.py to launch the backend.

Important: react-beautiful-dnd does not work well with reactStrictMode: true in Next.js. Turn the option off in next.config.js.

Configure Dataset

  • The [Optional] steps are needed for demonstration purposes and are not mandatory for deploying the application.

    1. Upload your image data to Azure Blob Storage.

      • [Optional] dataset > data > Upload to Blob image container
      • dataset > emoji > Upload to Blob emoji container
    2. Image and category metadata are managed in a SQL database.

      • DB Creation: backend\infra\db_postgres.sql
      • [Optional] DB Data Generate: backend\util\postgre_gen_db_data.py
    3. Image search requires creating an Azure Cognitive Search index.

      • Azure Cognitive Search Index Creation: backend\util\acs_index_gen.py
      • [Optional] trigger indexer: The web skill (azure functions: acs_skillset_for_indexer) should be deployed before it is triggered.
    4. [Optional] Update and synchronize the 'sid' attribute in Azure Cognitive Search based on metadata from the SQL database.

      • backend\util\data_for_dev\acs_index_mapping_with_postgre.py

For development data creation and the dataset, see the sample images in the dataset and backend\util directories.

API Documentation (Swagger)

http://localhost:5000/docs

Configure for Deployment to Azure

  1. Deployment can be done using Azure Template or Azure Bicep.
  • Azure Template

    • Click the template button.

      Deploy to Azure

    • Convert Bicep to JSON: url

  • Azure Bicep

    1. Deploy Azure Resources > backend\infra

    Set up your parameters for Azure Bicep.

    "prefix": {
        "value": "<your-value-for-prefix>"
    },
    "pgsqlId": {
        "value": "<your-postgre-sql-id>"
    },
    "pgsqlPwd": {
        "value": "<your-postgre-sql-password>"
    }
    2. Execute the script for Azure Bicep
    PS> .\main.ps1 -resourceGroup <your-resource-group-name> -location <your-resource-location>
  1. Build Next.js application
  • Execute the npm run build command. This builds the UI code and creates the public directory in backend.

    "scripts": {
        "dev": "next dev",
        "build": "next build && next export -o backend/public",
        "start": "next start",
        "lint": "next lint"
    }
  • The values in .env.production in the project root are embedded into the JavaScript files.

    MS_CLARITY_ID= //[Optional]
    ENV_TYPE=prod
  1. Upload the UI and Python code to Azure App Service (using the Visual Studio Code extension: Azure Tools)

    1. Navigate to Azure Tools (Visual Studio Code extension) > Resources > your Azure Subscription > App Service > your App Service.
    2. Click on Deploy to Web app...
    3. Select the backend directory as the target directory.
  2. Set the startup command in Azure App Service.

    1. Open your Web App in the Azure Portal (portal.azure.com).

    2. Scroll to Configuration under Settings.

    3. Click on the General Settings tab.

    4. Enter the startup command.

      python app.py
  3. To set up environment variables in Azure App Service, you can follow these steps:

    1. In the Azure Portal, locate your App Service.
    2. On the left pane, click on "Configuration".
    3. Under "Application settings", click on "New application setting".
    4. Fill in the name and value for each environment variable.
    5. Click "OK", then at the top, click "Save".
    • Most of the values will be mapped during the deployment.

      AZURE_SEARCH_SERVICE_ENDPOINT=https://?.search.windows.net
      AZURE_SEARCH_INDEX_NAME=
      AZURE_SEARCH_ADMIN_KEY=
      COGNITIVE_SERVICES_ENDPOINT=https://?.cognitiveservices.azure.com
      COGNITIVE_SERVICES_API_KEY=
      BLOB_CONNECTION_STRING=
      BLOB_CONTAINER_NAME=
      BLOB_EMOJI_CONTAINER_NAME=
      AZURE_OPENAI_ENDPOINT=https://?.openai.azure.com/
      AZURE_OPENAI_API_KEY=
      AZURE_OPENAI_API_VERSION_IMG=2023-06-01-preview
      AZURE_OPENAI_API_VERSION_CHAT=2023-07-01-preview
      BING_IMAGE_SEARCH_KEY=
      SPEECH_SUBSCRIPTION_KEY=
      SPEECH_REGION=
      POSTGRE_HOST=
      POSTGRE_USER=
      POSTGRE_PORT=5432
      POSTGRE_DATABASE=
      POSTGRE_PASSWORD=
      ENV_TYPE=PROD
      APP_SECRET_STRING= //JWT token authentication key, e.g., mysecret
    • [Optional] backend/util/env_to_app_service_fmt.py converts the .env file to an appsettings.json for the settings in Azure App Service.

      [
        {
          "name": "AZURE_SEARCH_SERVICE_ENDPOINT",
          "value": "https://?.search.windows.net"
        },
        {
          ...
        }
        ...
      ]
  4. PostgreSQL offers vector search through the pgvector extension. This feature can be used to implement image search, potentially serving as an alternative to Azure Cognitive Search.
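Returning to the optional .env conversion in the environment variables step: the sketch below shows one way such a conversion could work. It is not the contents of backend/util/env_to_app_service_fmt.py; the function name `env_to_app_settings` and the sample values are hypothetical, and the output only mirrors the name/value shape shown in that step. Inline `//` comments, as used in the listing there, are not stripped in this sketch.

```python
import json


def env_to_app_settings(env_text):
    """Convert KEY=VALUE lines into a list of {"name", "value"} entries."""
    settings = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, '#' comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        name, value = line.split("=", 1)
        settings.append({"name": name.strip(), "value": value.strip()})
    return settings


# Sample input (assumption): placeholder values, not real credentials.
sample_env = """\
AZURE_SEARCH_SERVICE_ENDPOINT=https://?.search.windows.net
# comment lines are ignored
POSTGRE_PORT=5432
ENV_TYPE=PROD
"""

print(json.dumps(env_to_app_settings(sample_env), indent=2))
```

To process a real file, read it first, for example `env_to_app_settings(open(".env", encoding="utf-8").read())`, and write the resulting JSON to a file instead of printing it.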