Social Vegan is essentially a dating app for serious daters, built on embeddings and a vector database. It is a project to explore the possibility of using OpenAI's embedding model to build matchmaking services.
In this version, we collect each user's core values about intimate relationships (limited to 100 words and stored in the variable expectation), pipe them into OpenAI's text-embedding-ada-002 model, and store the resulting vectors (the embeddings) in a Pinecone vector database and a local SQLite3-based database. For each user's vector, we then retrieve the nearest 2 vectors from the database, which represent the people with the most similar expectations, and return the corresponding user_ids as match results. Using the user_ids, we can then retrieve the matched users' profiles from the database, display them to the user, compare the two users' expectations, and finally let them decide whether to contact the other person.
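The retrieval step described above can be sketched without any external service. This is a minimal illustration, not the project's code: it assumes cosine similarity as the distance metric (the README does not say which metric the Pinecone index uses), it uses made-up 3-dimensional vectors in place of real embeddings, and the names cosine_similarity, nearest_matches, and toy_vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_matches(user_id, vectors, top_k=2):
    """Return the top_k (other_id, score) pairs most similar to user_id.

    The querying user is excluded so that nobody is matched with themselves.
    """
    query = vectors[user_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in vectors.items()
        if other_id != user_id
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for embeddings (assumption: real vectors come from the model).
toy_vectors = {
    "aisha": [0.9, 0.1, 0.3],
    "james": [0.8, 0.2, 0.3],
    "sophia": [0.1, 0.9, 0.2],
    "liam": [0.7, 0.1, 0.5],
}

matches = nearest_matches("aisha", toy_vectors)
for other_id, score in matches:
    print(f"aisha is matched with {other_id} (score {score:.3f})")
```

A vector database performs the same ranking at scale with an index instead of a full scan, which is why the brute-force loop here is only suitable for illustration.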
- Python, as the main programming language.
- OpenAI
  - text-embedding-ada-002 model, as the main embedding model.
  - gpt-3.5-turbo-1106 model, to perform the completion task.
- Pinecone, as the vector database.
- SQLite3, as the local database
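To illustrate the role of the local database in this stack, here is a small sketch of storing a profile with the standard-library sqlite3 module and looking it up again by user_id. The table layout and the helper names add_user and get_profile are assumptions made for this example; the schema the project actually uses is not shown in this README.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        email TEXT NOT NULL,
        expectation TEXT NOT NULL
    )
    """
)


def add_user(name, email, expectation):
    """Insert one profile and return the user_id SQLite assigned to it."""
    cursor = conn.execute(
        "INSERT INTO users (name, email, expectation) VALUES (?, ?, ?)",
        (name, email, expectation),
    )
    conn.commit()
    return cursor.lastrowid


def get_profile(user_id):
    """Return (name, email, expectation) for user_id, or None if unknown."""
    return conn.execute(
        "SELECT name, email, expectation FROM users WHERE user_id = ?",
        (user_id,),
    ).fetchone()


james_id = add_user("James", "james34@example.com", "James's core value")
profile = get_profile(james_id)
print(profile)
```

Keeping the profile text in SQLite and only the vector in the vector database means a match result needs to carry nothing more than a user_id.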
To get a local copy up and running, follow these simple steps.
- Get an OpenAI API key at OpenAI.
- Get a Pinecone API key at Pinecone.
- Create a Pinecone index on the Pinecone dashboard, and set the index name to socialvegan. The index dimension must be 1536, because text-embedding-ada-002 produces 1536-dimensional vectors.
- Export the API keys as environment variables:
    export OPENAI_API_KEY=<your key>
    export PINECONE_API_KEY=<your key>
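Before running anything that calls the APIs, it can help to confirm that both keys are actually visible to Python. The check below is a hedged sketch: the helper missing_api_keys is hypothetical and is not part of the repository.

```python
import os

# The two variable names exported in the prerequisites.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY")


def missing_api_keys(environ=None):
    """Return the names of required keys that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_KEYS if not environ.get(name)]


missing = missing_api_keys()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Both API keys are set.")
```

Passing a plain dictionary as environ makes the check easy to try without touching the real environment.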
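- Clone this repo to your local machine:
    git clone https://github.com/madeyexz/social_vegan.git
- Install the dependencies (sqlite3 ships with Python's standard library and does not need to be installed; depending on the version, the Pinecone client has also been distributed as pinecone-client):
    pip install openai pinecone tenacity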
- Modify the user data (currently the preset data).
  In main.py, you will find the following code snippet:
      # main.py
      # ...
      def main():
          # ...
          preset_data = [
              ["James", 34, True, True, "US", "james34@example.com", "James's core value"],
              ["Sophia", 28, False, True, "UK", "sophia28@example.co.uk", "Sophia's core value"],
              # and some more
          ]
          # ...
  Each person's data is stored as a list inside the preset_data list. The order of the fields is as follows:
      [name, age, is_male, is_heterosexual, city, email, expectation]
  You can modify the preset data:
  - The number of preset entries is unlimited, but keep in mind that a large number of entries will slow the program down, because we will run into RateLimitError when we try to embed the data.
  - An example of expectation is:
      In an intimate relationship, I value honesty and communication above all. Trust is the foundation of any strong relationship, and it's something I take very seriously. I believe in being open and transparent with my partner, sharing thoughts, feelings, and experiences freely. Mutual respect is also paramount; respecting each other's individuality, space, and opinions helps in nurturing a healthy bond. I also think it's important to support each other's goals and dreams, as growing together strengthens the relationship. Lastly, a sense of humor and the ability to enjoy life's simple moments together make every day special.
- Modify the person you want to query at the end of the file by changing the index:
      # main.py
      # ...
      def main():
          # ...
          # get the result of person with index 0
          user_id = ids[0]
          # ...
- Run the program:
      python3 main.py
- You should see results like this:
You (Aisha) are matched with James with a score of 0.954092443!
Your expectation is: I believe the core of any intimate relationship is mutual respect and empathy. Understanding and caring for each other's emotional and physical well-being is paramount. I value honesty and integrity; being truthful and upfront builds trust. I also look for a sense of humor and light-heartedness in a partner, as I believe laughter and joy are essential in life. Supporting each other's ambitions and being each other's cheerleader in life's journey is something I hold in high regard.
James's expectation is: In an intimate relationship, I value honesty and communication above all. Trust is the foundation of any strong relationship, and it's something I take very seriously. I believe in being open and transparent with my partner, sharing thoughts, feelings, and experiences freely. Mutual respect is also paramount; respecting each other's individuality, space, and opinions helps in nurturing a healthy bond. I also think it's important to support each other's goals and dreams, as growing together strengthens the relationship. Lastly, a sense of humor and the ability to enjoy life's simple moments together make every day special., and his/her email is james34@example.com
It sounds like you and this person have similar expectations and values when it comes to intimate relationships. Here are five reasons why you might make a good match:
- Mutual respect and empathy: Both of you prioritize mutual respect and empathy in a relationship, showing that you are both considerate of each other's feelings and well-being.
- Honesty and transparency: You both value honesty and open communication, which creates a foundation of trust and understanding in the relationship.
- Support for each other's ambitions: Both of you emphasize the importance of supporting each other's goals and dreams, indicating that you are both willing to be each other's cheerleaders in life's journey.
- Sense of humor and enjoyment of life: You both appreciate a sense of humor and the ability to enjoy life's simple moments, suggesting that you can find joy and laughter in each other's company.
- Shared values in nurturing a healthy bond: Your expectations align in respecting each other's individuality, space, and opinions, indicating that you both understand the importance of nurturing a healthy and balanced relationship.
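The positional layout of each preset_data row ([name, age, is_male, is_heterosexual, city, email, expectation]) is easy to get wrong when editing by hand. The sketch below gives the seven fields names and checks the 100-word limit on expectation. The Person class and the parse_row helper are hypothetical, and whether main.py enforces the word limit itself is not stated in this README.

```python
from dataclasses import dataclass

MAX_EXPECTATION_WORDS = 100  # limit stated in the project description


@dataclass
class Person:
    name: str
    age: int
    is_male: bool
    is_heterosexual: bool
    city: str
    email: str
    expectation: str


def parse_row(row):
    """Turn one positional preset_data row into a Person, validating it."""
    if len(row) != 7:
        raise ValueError(f"expected 7 fields, got {len(row)}")
    person = Person(*row)
    word_count = len(person.expectation.split())
    if word_count > MAX_EXPECTATION_WORDS:
        raise ValueError(
            f"expectation for {person.name} has {word_count} words, "
            f"limit is {MAX_EXPECTATION_WORDS}"
        )
    return person


preset_data = [
    ["James", 34, True, True, "US", "james34@example.com", "James's core value"],
    ["Sophia", 28, False, True, "UK", "sophia28@example.co.uk", "Sophia's core value"],
]

people = [parse_row(row) for row in preset_data]
for person in people:
    print(person.name, person.email)
```

Named fields also make the later steps easier to read, for example person.expectation instead of row[6].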
- You can't specify homosexual or heterosexual results (this is a bug), and non-binary genders are currently not supported.
- The embedding model is prone to rate limit errors.
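One common way to soften rate limit errors is to retry with exponential backoff. The dependency list includes tenacity, which offers decorators for this pattern; the sketch below uses only the standard library so that it runs on its own. The helper call_with_backoff, the FakeRateLimitError class, and the flaky_embed function are all hypothetical stand-ins, and no real API is called.

```python
import random
import time


def call_with_backoff(func, *args, max_attempts=5, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep, **kwargs):
    """Call func, retrying on retry_on errors with exponential backoff.

    The wait before retry n is base_delay * 2**(n - 1) plus random jitter
    of up to the same amount. The last failure is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == max_attempts:
                raise
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))


class FakeRateLimitError(Exception):
    """Stand-in for the API client's rate limit exception (hypothetical)."""


calls = {"count": 0}


def flaky_embed(text):
    """Pretend embedding call that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeRateLimitError("slow down")
    return [0.1, 0.2, 0.3]


# Record the delays instead of actually sleeping, to keep the demo instant.
recorded_delays = []
vector = call_with_backoff(
    flaky_embed,
    "hello",
    retry_on=(FakeRateLimitError,),
    sleep=recorded_delays.append,
)
print(vector, "after", calls["count"], "attempts")
```

In real use, retry_on would be set to the client library's rate limit exception and sleep left at its default.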
- Create a UI, or a web interface
- Allow custom user input
- Collect user feedback
- Tweak the GPT prompt to make it more suitable for the task
- Try out different kinds of user input (i.e. the expectation collected) to find out what works best for dating.
See the open issues for a full list of proposed features (and known issues).
This project utilizes OpenAI's second-generation embedding model, text-embedding-ada-002. The limitations and risks of using the model are described here. Some key points are:
- The models encode social biases, e.g. via stereotypes or negative sentiment towards certain groups.
- Models lack knowledge of events that occurred after Sep 2021.
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the GNU Affero General Public License v3.0. See LICENSE.txt for more information.
- Email me: ian.xiao@stu.pku.edu.cn
- Project Link: Social Vegan