Why AI at Nexora?

The future of search and recommendations is AI-driven. I am excited about Nexora because it focuses on building intelligent, real-world recommendation tools. This project shows how AI/ML can represent a person's "vibe" (their interests and personality, written as text) and use it to match people in a data-driven way.


In [10]:

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


In [11]:
data = {
    "user_id": [1,2,3,4,5],
    "name": ["Aarav", "Riya", "Kabir", "Ananya", "Dev"],
    "interests": [
        "music movies football",
        "reading music painting",
        "travel food fitness",
        "coding chess books",
        "movies travel cricket"
    ],
    "personality": [
        "introvert calm creative",
        "extrovert friendly energetic",
        "adventurous social active",
        "introvert logical thinker",
        "social fun energetic"
    ]
}

df = pd.DataFrame(data)
df


   user_id    name               interests                   personality
0        1   Aarav   music movies football       introvert calm creative
1        2    Riya  reading music painting  extrovert friendly energetic
2        3   Kabir     travel food fitness     adventurous social active
3        4  Ananya      coding chess books     introvert logical thinker
4        5     Dev   movies travel cricket          social fun energetic


In [12]:
tfidf = TfidfVectorizer()
feature_matrix = tfidf.fit_transform(df['interests'] + ' ' + df['personality'])


In [13]:
similarity = cosine_similarity(feature_matrix)
similarity


array([[1.        , 0.12702491, 0.        , 0.1230389 , 0.13631701],
       [0.12702491, 1.        , 0.        , 0.        , 0.1317529 ],
       [0.        , 0.        , 1.        , 0.        , 0.2635058 ],
       [0.1230389 , 0.        , 0.        , 1.        , 0.        ],
       [0.13631701, 0.1317529 , 0.2635058 , 0.        , 1.        ]])
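
To make the matrix above easier to read, here is a minimal sketch of cosine similarity computed by hand. It is an illustration only, not what `cosine_similarity` does internally: it uses raw word counts instead of TF-IDF weights, and the helper name `cosine` is hypothetical. The absolute values therefore differ from the matrix (TF-IDF gives less weight to words that appear in several profiles), but the pattern is the same: users with no shared words score exactly 0.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between the raw word-count vectors of two strings."""
    ca, cb = Counter(a.split()), Counter(b.split())
    # Dot product only needs the words that both profiles contain.
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Combined "interests + personality" text, as built for the TF-IDF step.
kabir = "travel food fitness adventurous social active"
dev = "movies travel cricket social fun energetic"
ananya = "coding chess books introvert logical thinker"

print(cosine(kabir, dev))     # shares "travel" and "social"
print(cosine(kabir, ananya))  # shares no words
```

Kabir and Dev share two of their six words, so the raw-count score is 2/6; Kabir and Ananya share none, which is why their entry in the matrix is 0.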

In [14]:
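def recommend(user_index, top_n=3):
    """Print the top_n users most similar to the user at user_index."""
    # Pair each other user's index with their score, skipping the user themselves
    # (safer than assuming the user is always first after sorting).
    scores = [(idx, score) for idx, score in enumerate(similarity[user_index])
              if idx != user_index]
    # Highest similarity first.
    scores.sort(key=lambda x: x[1], reverse=True)
    print("✅ Top matches for:", df.iloc[user_index]['name'])
    for idx, score in scores[:top_n]:
        print(df.iloc[idx]['name'], "- Similarity Score:", round(score, 2))

recommend(0)   # change 0 to test others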


✅ Top matches for: Aarav
Dev - Similarity Score: 0.14
Riya - Similarity Score: 0.13
Ananya - Similarity Score: 0.12


In [6]:
recommend(0)   # Aarav
recommend(1)   # Riya
recommend(2)   # Kabir


✅ Top matches for: Aarav
Dev - Similarity Score: 0.14
Riya - Similarity Score: 0.13
Ananya - Similarity Score: 0.12
✅ Top matches for: Riya
Dev - Similarity Score: 0.13
Aarav - Similarity Score: 0.13
Kabir - Similarity Score: 0.0
✅ Top matches for: Kabir
Dev - Similarity Score: 0.26
Aarav - Similarity Score: 0.0
Riya - Similarity Score: 0.0


In [9]:
good = 0
users = [0, 1, 2]   # the three users tested above
scores = []

for u in users:
    # Best match for user u: highest similarity to any *other* user.
    s = max(score for idx, score in enumerate(similarity[u]) if idx != u)
    scores.append(s)
    if s > 0.7:   # threshold for counting a match as "good"
        good += 1

print("✅ Average similarity score:", sum(scores)/len(scores))
print("✅ Good matches (score > 0.7):", good, "/", len(users))


✅ Average similarity score: 0.17719190462208542
✅ Good matches (score > 0.7): 0 / 3
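
The check above covers only the first three users. As a rough sketch, the same metric can be extended to all five. To keep the snippet self-contained, the matrix is copied from the `cosine_similarity` output shown earlier; the names `similarity_rows` and `best_match_score` are hypothetical and not part of the notebook.

```python
# Similarity matrix copied from the cosine_similarity output shown earlier.
similarity_rows = [
    [1.0, 0.12702491, 0.0, 0.1230389, 0.13631701],
    [0.12702491, 1.0, 0.0, 0.0, 0.1317529],
    [0.0, 0.0, 1.0, 0.0, 0.2635058],
    [0.1230389, 0.0, 0.0, 1.0, 0.0],
    [0.13631701, 0.1317529, 0.2635058, 0.0, 1.0],
]


def best_match_score(matrix, u):
    """Highest similarity between user u and any other user."""
    return max(s for i, s in enumerate(matrix[u]) if i != u)


best = [best_match_score(similarity_rows, u) for u in range(len(similarity_rows))]
average = sum(best) / len(best)
good = sum(1 for s in best if s > 0.7)

print("Average best-match score:", round(average, 4))
print("Good matches (score > 0.7):", good, "/", len(best))
```

Excluding each user by index, rather than by taking the second item of a sorted list, keeps the result correct even if two users have identical profiles.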


Reflection
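- Cosine similarity ranked the most similar users first, but absolute scores were low: the best matches for the three tested users averaged about 0.18, and none exceeded 0.7.
- TF-IDF was enough to build a working baseline from the combined interests + personality text, though it only credits exact word overlap.
- Could be improved by replacing the TF-IDF vectors with text embeddings (for example, from OpenAI), which can capture related words that are not identical.
- Can integrate a vector database such as Pinecone or Milvus to store and search the vectors at scale.
- Edge case: if a user shares no words with anyone, the recommender still returns the top-ranked users even though their score is 0 (as with Kabir's second and third matches).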
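
The edge case can be sketched as follows. This is one possible design, not the notebook's implementation: the function name `top_matches` and its `min_score` parameter are hypothetical. By default it keeps the current behaviour (always return the closest users); passing `min_score=0.0` drops users with no overlap at all. The example row is Kabir's row from the similarity matrix.

```python
def top_matches(row, user_index, k=3, min_score=None):
    """Return up to k (index, score) pairs for the users most similar to user_index.

    row        -- one row of the similarity matrix
    min_score  -- if given, keep only scores strictly greater than this value
    """
    candidates = [
        (i, s)
        for i, s in enumerate(row)
        if i != user_index and (min_score is None or s > min_score)
    ]
    # Highest score first; ties keep their original index order.
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:k]


# Kabir's row, copied from the similarity matrix (index 2 is Kabir himself).
kabir_row = [0.0, 0.0, 1.0, 0.0, 0.2635058]

print(top_matches(kabir_row, user_index=2))                 # closest users, even at score 0
print(top_matches(kabir_row, user_index=2, min_score=0.0))  # only users with some overlap
```

With the default, Kabir still gets three matches (Dev, then two users at 0.0); with `min_score=0.0` only Dev remains, which may be a more honest result to show a user.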
