alkkuma1/ThoughtEmbedding_AlloyDB_VertexAI

Thought Embedding Application

This repository contains a simple Streamlit application for embedding and analyzing user thoughts using PostgreSQL and Principal Component Analysis (PCA).

Cloud Setup

  1. Create a Google Cloud Account
  2. Create an AlloyDB Cluster
  3. Create an AlloyDB instance

API

Enable the following APIs:

  1. AlloyDB API
  2. Compute Engine API
  3. Cloud Resource Manager API
  4. Service Networking API

Queries

Create table

CREATE TABLE "public".thought_embedding ( thought TEXT, thought_id SERIAL PRIMARY KEY, entry_date TIMESTAMP );

Enable the following extensions

CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS google_ml_integration;

Query to autogenerate vectors

ALTER TABLE thought_embedding ADD COLUMN embedding vector GENERATED ALWAYS AS (embedding('textembedding-gecko@001',thought)) STORED;
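Because embedding is a generated column and thought_id is SERIAL, an insert would only need to supply thought and entry_date; the database fills in the rest. The following is a minimal sketch of building such a parameterized insert. The helper name build_insert, the named-parameter style, and the empty-input check are assumptions for illustration, not the repository's code.

```python
from datetime import datetime
from typing import Dict, Optional, Tuple

# Hypothetical statement: only the non-generated columns are listed.
# thought_id (SERIAL) and embedding (GENERATED ALWAYS) are filled by the database.
INSERT_SQL = (
    "INSERT INTO thought_embedding (thought, entry_date) "
    "VALUES (:thought, :entry_date)"
)


def build_insert(
    thought: str, now: Optional[datetime] = None
) -> Tuple[str, Dict[str, object]]:
    """Return the SQL text and bound parameters for storing one thought."""
    cleaned = thought.strip()
    if not cleaned:
        # Assumption: blank input is rejected rather than stored.
        raise ValueError("thought must not be empty")
    params = {"thought": cleaned, "entry_date": now or datetime.now()}
    return INSERT_SQL, params
```

The returned pair could then be handed to any driver or toolkit that accepts named parameters, for example SQLAlchemy's text() construct.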

Functions Overview

import_embedding.py

  • getconn(): Establishes a connection to the AlloyDB (PostgreSQL-compatible) database.

  • inserttodb(thought: str): Inserts a user thought into the database.

  • getembedding(): Retrieves all stored thought embeddings from the database.

  • similar_thoughts(thought): Finds the three stored thoughts most similar to the input, ranked by vector similarity.
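The top-3 lookup described above can be illustrated in memory. This sketch ranks stored embeddings by cosine distance to a query vector and keeps the three closest, which is the same ordering pgvector's <=> operator gives. The names cosine_distance and top_k_similar and the (thought, embedding) row layout are hypothetical; the repository's similar_thoughts presumably performs the comparison inside the database, and its choice of distance metric is not stated here.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float],
    rows: Sequence[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (thought, distance) pairs closest to the query vector."""
    scored = [(thought, cosine_distance(query, emb)) for thought, emb in rows]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]
```

A smaller distance means a closer match, so the first element of the result is the most similar stored thought.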

stream_app.py

  • User Interface: Collects and records thoughts, then displays similar entries and a PCA plot.

utils.py

  • compute_pca(): Performs PCA on embeddings, returning principal components for visualization.
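As a rough illustration of what the PCA step does, the sketch below centers the embedding matrix and projects it onto its top two principal axes. It uses a NumPy SVD in place of scikit-learn, which the requirements list and which compute_pca() most likely relies on; the function name compute_pca_sketch and its signature are assumptions.

```python
import numpy as np


def compute_pca_sketch(embeddings, n_components: int = 2) -> np.ndarray:
    """Project row vectors onto their leading principal components."""
    X = np.asarray(embeddings, dtype=float)
    if X.ndim != 2 or X.shape[0] < 2:
        raise ValueError("need a 2-D array with at least two rows")
    # PCA operates on mean-centered data.
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

The two returned columns can serve as x and y coordinates for a scatter plot of the stored thoughts. Sign conventions of the axes may differ from scikit-learn's output, which does not affect the shape of the plot beyond a possible mirror flip.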

Requirements

  • Python 3.8+
  • PostgreSQL
  • Google Cloud AlloyDB
  • Required Python packages:
pip install sqlalchemy streamlit pandas matplotlib scikit-learn pg8000

About

This project is a Streamlit app that captures user thoughts, stores them in AlloyDB, automatically generates their embeddings, retrieves similar entries by comparing those embeddings, and visualizes them with Principal Component Analysis (PCA).
