**Basic Recommendation System** using **Named Entity Recognition** (NER) from **spaCy**

The **recommend_content** function performs the following tasks:

- Lowercases the user's entities and converts them into a set for efficient lookup.
- Iterates through each item in the content database.
- Extracts entities from the text of each content item.
- Recommends the item if any of its extracted entities matches one of the user’s entities.
- Returns the list of recommended items, each of which contains at least one of the user’s entities.

In [1]:
import spacy

# Load the spaCy model
nlp = spacy.load('en_core_web_sm')

# Function to extract entities from text
def extract_entities(text):
    doc = nlp(text)
    entities = [ent.text.lower() for ent in doc.ents]  # Lowercase so comparisons are case-insensitive
    return entities

# Sample content data (books, news, or articles)
content_database = [
    {'title': 'Tech Innovations in 2024', 'text': 'John Doe discusses the latest tech innovations in San Francisco.'},
    {'title': 'Travel Tips for New York', 'text': 'Explore the best travel destinations in New York with our guide.'},
    {'title': 'Political News Today', 'text': 'Recent political events involving John Doe in Washington.'}
]

# Entity profiles (can be expanded or customized further; not used by recommend_content below)
entity_profiles = {
    'john doe': {'type': 'person', 'interests': ['technology', 'politics']},
    'san francisco': {'type': 'location', 'interests': ['technology']},
    'new york': {'type': 'location', 'interests': ['tourism', 'real estate']},
    'washington': {'type': 'location', 'interests': ['politics']}
}

def recommend_content(user_entities, content_database):
    recommendations = []
    user_entities = set(entity.lower() for entity in user_entities)  # Convert to set to improve search efficiency
    for content in content_database:
        content_entities = extract_entities(content['text'])
        if any(entity in user_entities for entity in content_entities):
            recommendations.append(content)
    return recommendations

# Example user input
user_input = "I am interested in John Doe and San Francisco."
user_entities = extract_entities(user_input)

# Recommend content based on extracted entities
recommended = recommend_content(user_entities, content_database)

# Display recommendations
for item in recommended:
    print(f"Recommended: {item['title']}")


Recommended: Tech Innovations in 2024
Recommended: Political News Today
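
The any-match rule above can also be exercised without loading the spaCy model. The following is a minimal sketch under stated assumptions: the entity lists are hand-written stand-ins for what `extract_entities` might return (actual NER output depends on the model), and `recommend_any` and `content_entities_by_title` are hypothetical names used only for illustration.

```python
# Hand-written entity lists standing in for extract_entities() output
# (assumed for illustration; not real model output)
content_entities_by_title = {
    'Tech Innovations in 2024': ['john doe', 'san francisco'],
    'Travel Tips for New York': ['new york'],
    'Political News Today': ['john doe', 'washington'],
}

def recommend_any(user_entities, entities_by_title):
    """Return titles that share at least one entity with the user (hypothetical helper)."""
    wanted = {entity.lower() for entity in user_entities}
    return [
        title
        for title, entities in entities_by_title.items()
        if not wanted.isdisjoint(entities)  # True when at least one entity overlaps
    ]

print(recommend_any(['John Doe', 'San Francisco'], content_entities_by_title))
```

Using `isdisjoint` expresses "at least one shared entity" directly as a set operation and stops at the first overlap, much like the `any(...)` generator in the cell above.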


The second version of **recommend_content** is stricter and performs the following tasks:

- Converts the user's list of entities into a set for efficient searching.
- Iterates through each piece of content in the database.
- Extracts entities from each content item and converts them into a set.
- Checks whether the user's entities form a subset of the content's entities.
- Adds the content to the list of recommendations only if every user entity is present in it.
- Returns the list of recommended content.

In [2]:
import spacy

# Load the spaCy model
nlp = spacy.load('en_core_web_sm')

# Function to extract entities from text
def extract_entities(text):
    doc = nlp(text)
    entities = [ent.text.lower() for ent in doc.ents]  # Lowercase so comparisons are case-insensitive
    return entities

# Sample content data (books, news, or articles)
content_database = [
    {'title': 'Tech Innovations in 2024', 'text': 'John Doe discusses the latest tech innovations in San Francisco.'},
    {'title': 'Travel Tips for New York', 'text': 'Explore the best travel destinations in New York with our guide.'},
    {'title': 'Political News Today', 'text': 'Recent political events involving John Doe in Washington.'}
]

# Entity profiles (can be expanded or customized further; not used by recommend_content below)
entity_profiles = {
    'john doe': {'type': 'person', 'interests': ['technology', 'politics']},
    'san francisco': {'type': 'location', 'interests': ['technology']},
    'new york': {'type': 'location', 'interests': ['tourism', 'real estate']},
    'washington': {'type': 'location', 'interests': ['politics']}
}

def recommend_content(user_entities, content_database):
    recommendations = []
    user_entities = set(entity.lower() for entity in user_entities)  # Convert to set to improve search efficiency
    for content in content_database:
        content_entities = extract_entities(content['text'])
        content_entities_set = set(content_entities)
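        # Check that all user entities are present in the content entities.
        # An empty set is a subset of every set, so require at least one user
        # entity; otherwise input with no recognized entities would match everything.
        if user_entities and user_entities.issubset(content_entities_set):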
            recommendations.append(content)
    return recommendations

# Example user input
user_input = "I am interested in John Doe and San Francisco."
user_entities = extract_entities(user_input)

# Recommend content based on extracted entities
recommended = recommend_content(user_entities, content_database)

# Display recommendations
for item in recommended:
    print(f"Recommended: {item['title']}")


Recommended: Tech Innovations in 2024
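
The all-match rule can be sketched the same way without spaCy. This is an illustrative sketch, not the notebook's own code: the entity lists are hand-written stand-ins for `extract_entities` output (real NER results may differ), and `recommend_all` and `content_entities_by_title` are hypothetical names. Because an empty set is a subset of every set, the sketch returns no recommendations when no user entities are supplied.

```python
# Hand-written entity lists standing in for extract_entities() output
# (assumed for illustration; not real model output)
content_entities_by_title = {
    'Tech Innovations in 2024': ['john doe', 'san francisco'],
    'Travel Tips for New York': ['new york'],
    'Political News Today': ['john doe', 'washington'],
}

def recommend_all(user_entities, entities_by_title):
    """Return titles whose entities include every user entity (hypothetical helper)."""
    wanted = {entity.lower() for entity in user_entities}
    if not wanted:
        # An empty set is a subset of every set, so return nothing rather than everything
        return []
    return [
        title
        for title, entities in entities_by_title.items()
        if wanted.issubset(entities)  # issubset accepts any iterable
    ]

print(recommend_all(['John Doe', 'San Francisco'], content_entities_by_title))
print(recommend_all(['John Doe'], content_entities_by_title))
```

With both entities supplied, only the item mentioning John Doe and San Francisco together qualifies; with a single entity, the all-match and any-match rules return the same items.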
