# Profile Analysis with GenAI

This project is a variation of the **company_analysis** project found in the same repository. While it follows a similar structure, it focuses on helping professionals streamline their networking process by summarizing a person's background and key details.

You can use this project to introduce yourself to individuals who share common interests or are working on initiatives that you'd like to join.

The final report generated by the LLM includes information such as:
- **Background**
- **Expertise**
- **Core Values**
- **Leadership Style**
- **Relevant Accomplishments**
- **Insights for Connection**
- **Networking Opportunities**

As in the previous project, the information highlighted by the LLM is fully customizable. You can simply adjust the prompt to focus on what is most relevant to you.


In [1]:
# imports

import os
import requests
import json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

## Objective

This project aims to:  
- Extract relevant professional profile details  
- Process and structure the gathered information  
- Utilize **GPT-4o mini** to analyze and summarize key insights  

### **Workflow Overview**

#### **1. Data Collection**  
The notebook retrieves the data from a primary source, such as a professional website, portfolio, blog or LinkedIn page. It then processes and structures this information into a meaningful format for further analysis.  

#### **2. AI-Powered Analysis**  
The structured data is analyzed using **GPT-4o mini**, which extracts insights along the dimensions named in the prompt, such as background, expertise, and core values.  

#### **3. Results and Interpretation**  
The AI-generated insights are carefully reviewed and interpreted to help users achieve their primary goal: building valuable professional connections.
 
#### **4. Conclusion**
By following this structured workflow, we can uncover valuable insights into each professional's background, expertise, and goals. These insights empower individuals and organizations to make well-informed networking decisions, discover new collaboration opportunities, and strengthen professional relationships.
  


## 📚 Importing Necessary Libraries and APIs

We start by importing the necessary libraries for data handling and LLM integration. Then, we retrieve our **OpenAI API key** to authenticate requests and define the **GPT-4o mini** model for analyzing the extracted content. 

In [2]:
# Initialize and constants

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")
    
MODEL = 'gpt-4o-mini'
openai = OpenAI()

API key looks good so far


## 🛠️ Data Extraction and Preparation

Using **BeautifulSoup**, we create a function to extract content from a given website. We then focus on using the links found on each page to provide additional context for the LLM's analysis.


In [3]:
# A class to represent a Webpage

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"
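
The link prompt used later notes that some of the collected links may be relative. One optional refinement, which is not part of the original notebook, is to resolve them against the page URL before handing them to the model. A minimal sketch, assuming a hypothetical helper named `absolutize_links`:

```python
from urllib.parse import urljoin, urlparse


def absolutize_links(base_url, links):
    """Resolve relative hrefs against base_url, keep http(s) only, drop duplicates."""
    seen = set()
    result = []
    for href in links:
        absolute = urljoin(base_url, href)
        # Skip non-web schemes such as mailto:, tel: or javascript:
        if urlparse(absolute).scheme not in ("http", "https"):
            continue
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result
```

It could be applied as `absolutize_links(website.url, website.links)` when building the user prompt, so the model sees full URLs instead of having to reconstruct them.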



## 🤖 Profile Analysis with GPT-4o Mini

We use the LLM in two steps:

1. First, we pass the links extracted from the webpage (via BeautifulSoup) to the LLM and ask it to determine which links are most relevant to our goal. We also request that the results be returned in **JSON** format.  
2. Next, a second LLM call receives the contents of the landing page and of the selected links, and generates a comprehensive report covering the following points: **Background, Expertise, Core Values, Leadership Style, Relevant Accomplishments, Insights for Connection,** and **Networking Opportunities**.


In [4]:
linkedin_profile = Website("https://www.linkedin.com/company/fairsupply-com-au-pty-limited/people/")

In [5]:
link_system_prompt_profile = (
    "You are provided with a list of links found on a LinkedIn profile page. "
    "Decide which of these links might be relevant for someone seeking a thorough professional profile analysis, "
    "such as links to notable achievements, projects, or other professional connections. "
    "Exclude Terms of Service, Privacy policy, or email links. "
    "You should respond in JSON as in this example:"
)
link_system_prompt_profile += """
{
    "links": [
        {"type": "relevant profile link", "url": "https://www.linkedin.com/in/john-doe"}
    ]
}
"""


In [6]:
print(link_system_prompt_profile)

You are provided with a list of links found on a LinkedIn profile page. Decide which of these links might be relevant for someone seeking a thorough professional profile analysis, such as links to notable achievements, projects, or other professional connections. Exclude Terms of Service, Privacy policy, or email links. You should respond in JSON as in this example:
{
    "links": [
        {"type": "relevant profile link", "url": "https://www.linkedin.com/in/john-doe"}
    ]
}



In [7]:
def get_links_profile_prompt(website):
    user_prompt = (
        f"Here is the list of links on the website of {website.url}. "
        "Please decide which of these are relevant for a thorough professional profile analysis. "
        "Specifically, we are looking for pages or resources that help assess the individual's background, expertise, "
        "achievements, values, leadership style, or day-to-day responsibilities. "
        "Provide your response in JSON format, listing the full https URLs. "
        "Do not include Terms of Service, Privacy policy, or email links.\n"
        "Links (some might be relative links):\n"
    )
    user_prompt += "\n".join(website.links)
    return user_prompt

In [8]:
print(get_links_profile_prompt(linkedin_profile))

Here is the list of links on the website of https://www.linkedin.com/company/fairsupply-com-au-pty-limited/people/. Please decide which of these are relevant for a thorough professional profile analysis. Specifically, we are looking for pages or resources that help assess the individual's background, expertise, achievements, values, leadership style, or day-to-day responsibilities. Provide your response in JSON format, listing the full https URLs. Do not include Terms of Service, Privacy policy, or email links.
Links (some might be relative links):
/
/checkpoint/rp/request-password-reset?session_redirect=https%3A%2F%2Fwww%2Elinkedin%2Ecom%2Fcompany%2Ffairsupply-com-au-pty-limited%2Fpeople%2F
/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Fcompany%2Ffairsupply-com-au-pty-limited%2Fpeople%2F
/legal/user-agreement
/legal/privacy-policy
/legal/cookie-policy
/legal/user-agreement?trk=d_checkpoint_lg_consumerLogin_ft_user_agreement
/legal/privacy-policy?trk=d_checkpoint_l

In [9]:
def get_profile_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt_profile},
            {"role": "user", "content": get_links_profile_prompt(website)}
        ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)
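
Even with `response_format={"type": "json_object"}`, the reply is only guaranteed to be JSON, not to follow the `{"links": [...]}` shape requested in the prompt. The sketch below shows one way to validate it before use. The helper name `parse_link_response` and the choice to fall back to an empty list are assumptions for illustration, not part of the original notebook:

```python
import json


def parse_link_response(raw):
    """Parse the model's JSON reply and keep only well-formed https link entries."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Not valid JSON (or raw is None): fall back to "no links"
        return {"links": []}
    if not isinstance(data, dict):
        return {"links": []}
    entries = data.get("links", [])
    if not isinstance(entries, list):
        return {"links": []}
    clean = []
    for entry in entries:
        if not isinstance(entry, dict):
            continue
        url = entry.get("url")
        if isinstance(url, str) and url.startswith("https://"):
            clean.append({"type": str(entry.get("type", "link")), "url": url})
    return {"links": clean}
```

In `get_profile_links`, the final line could then return `parse_link_response(result)` instead of `json.loads(result)`, so that downstream code can always iterate over `links["links"]`.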


In [10]:
# Check which links the model selects from an individual LinkedIn profile page
get_profile_links("https://www.linkedin.com/in/arne-geschke-8bb61287/")

{'links': [{'type': 'relevant profile link',
   'url': 'https://au.linkedin.com/in/arne-geschke-8bb61287'},
  {'type': 'posts link',
   'url': 'https://www.linkedin.com/posts/arne-geschke-8bb61287_a-massive-step-for-fair-supply-as-we-are-activity-7289581656296898561-LKU2'},
  {'type': 'posts link',
   'url': 'https://www.linkedin.com/posts/julian-fenwick-415848_fair-supply-sustainability-summer-school-activity-7293029143137374208-4yTg'},
  {'type': 'posts link',
   'url': 'https://www.linkedin.com/posts/arne-geschke-8bb61287_esg-climatechange-activity-7285809882321010689-zZNh'},
  {'type': 'posts link',
   'url': 'https://www.linkedin.com/posts/arne-geschke-8bb61287_google-for-esg-fair-supply-raises-12m-activity-7285183552483115011-smff'},
  {'type': 'posts link',
   'url': 'https://www.linkedin.com/posts/benderson_this-year-at-fair-supply-has-been-one-of-activity-7275453053074350084-3FWJ'},
  {'type': 'courses link',
   'url': 'https://www.linkedin.com/learning/c-plus-plus-design-patt

In [11]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_profile_links(url)  # get_links is not defined in this notebook
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

In [12]:
def get_all_profile_details(url):
    result = "LinkedIn page:\n"
    result += Website(url).get_contents()
    links = get_profile_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result
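
As written, a single link that fails to download raises an exception and aborts the whole collection step. One possible variation, shown here only as a sketch, skips failed pages and reports them separately. The helper `collect_pages` and its `fetch` parameter are hypothetical names introduced for this example:

```python
def collect_pages(links, fetch):
    """Fetch each link with `fetch`; skip pages that raise instead of aborting.

    links: list of {"type": ..., "url": ...} dicts, as returned by the link step.
    fetch: any callable that takes a URL and returns the page text.
    Returns (combined_text, failed), where failed is a list of (url, error) pairs.
    """
    sections = []
    failed = []
    for link in links:
        try:
            contents = fetch(link["url"])
        except Exception as exc:  # broad on purpose: any fetch failure is skipped
            failed.append((link["url"], str(exc)))
            continue
        sections.append(f"\n\n{link['type']}\n{contents}")
    return "".join(sections), failed
```

With the notebook's own class, the call would look like `collect_pages(links["links"], lambda u: Website(u).get_contents())`.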


In [13]:
print(get_all_profile_details("https://www.linkedin.com/in/arne-geschke-8bb61287/"))

Found links: {'links': [{'type': 'relevant profile link', 'url': 'https://au.linkedin.com/in/arne-geschke-8bb61287'}, {'type': 'relevant post link', 'url': 'https://www.linkedin.com/posts/arne-geschke-8bb61287_a-massive-step-for-fair-supply-as-we-are-activity-7289581656296898561-LKU2'}, {'type': 'relevant post link', 'url': 'https://www.linkedin.com/posts/arne-geschke-8bb61287_esg-climatechange-activity-7285809882321010689-zZNh'}, {'type': 'relevant post link', 'url': 'https://www.linkedin.com/posts/arne-geschke-8bb61287_google-for-esg-fair-supply-raises-12m-activity-7285183552483115011-smff'}, {'type': 'relevant post link', 'url': 'https://www.linkedin.com/posts/julian-fenwick-415848_fair-supply-sustainability-summer-school-activity-7293029143137374208-4yTg'}, {'type': 'relevant learning course link', 'url': 'https://www.linkedin.com/learning/machine-learning-with-data-reduction-in-excel-r-and-power-bi?trk=public_profile_recommended-course'}, {'type': 'relevant learning course link', 

In [14]:
system_prompt_profile = "You are an assistant that analyzes the contents of a professional profile, \
focusing on a thorough profile analysis. Provide a concise overview of the person's background, \
expertise, core values, leadership style (if applicable), and relevant accomplishments. Respond in \
Markdown format. Include any details that would be helpful for a job searcher seeking to connect \
with this individual, such as insights into day-to-day responsibilities, professional development, \
or notable achievements, if available."


In [18]:
def get_profile_analysis(profile_name, url):
    user_prompt = f"You are looking at a professional named: {profile_name}\n"
    user_prompt += (
        "Here are the contents of their LinkedIn page and any other relevant pages. "
        "Use this information to create a concise overview of this individual's background, expertise, "
        "core values, leadership style (if applicable), and relevant accomplishments. "
        "Include any insights that would be helpful for someone seeking to connect with this person, "
        "such as day-to-day responsibilities, professional development, or notable achievements. "
        "Respond in Markdown.\n"
    )
    user_prompt += get_all_profile_details(url)
    user_prompt = user_prompt[:20_000]  # Truncate if more than 20,000 characters
    return user_prompt
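
The hard cut at 20,000 characters can end the prompt in the middle of a line or word. A small optional variation is to cut at the last line break before the limit. This is a sketch under that assumption; `truncate_at_line` is a hypothetical helper, not something the notebook defines:

```python
def truncate_at_line(text, limit=20_000):
    """Cut text to at most `limit` characters, preferring the last newline before the cut."""
    if len(text) <= limit:
        return text
    cut = text.rfind("\n", 0, limit)
    if cut <= 0:
        # No usable newline before the limit: fall back to a hard cut
        return text[:limit]
    return text[:cut]
```

It would replace the slice with `user_prompt = truncate_at_line(user_prompt)`. Note that the limit is still measured in characters, not tokens.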



In [19]:
def create_profile_analysis(profile_name, url):
    stream = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt_profile},
            {"role": "user", "content": get_profile_analysis(profile_name, url)}
          ],
        stream=True
    )
    
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)
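
The cleanup inside the streaming loop removes every occurrence of triple backticks and of the word "markdown", so it would also alter a report that legitimately mentions Markdown. A narrower alternative, offered as a sketch rather than as the notebook's method, strips only a fence that wraps the whole reply. The name `strip_wrapping_fence` is an assumption for this example:

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# An opening fence at the very start, optionally tagged "markdown"
_OPEN = re.compile(r"^\s*" + FENCE + r"(?:markdown)?[ \t]*\n?")
# A closing fence at the very end
_CLOSE = re.compile(r"\n?" + FENCE + r"\s*$")


def strip_wrapping_fence(text):
    """Remove a leading and trailing fence only when the text starts with one."""
    if not _OPEN.match(text):
        return text
    text = _OPEN.sub("", text, count=1)
    return _CLOSE.sub("", text, count=1)
```

In the loop, the raw text would be accumulated unchanged and only the displayed copy cleaned, for example `update_display(Markdown(strip_wrapping_fence(response)), display_id=display_handle.display_id)`.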

In [21]:
create_profile_analysis("Ed Donner", "https://www.linkedin.com/in/eddonner/")

Found links: {'links': [{'type': 'relevant profile link', 'url': 'https://www.linkedin.com/in/eddonner'}, {'type': 'relevant profile link', 'url': 'https://www.linkedin.com/posts/mert-adal%C4%B1-120726229_as-an-junior-mobile-developer-with-a-background-activity-7295793892308746240-MCc7'}, {'type': 'relevant profile link', 'url': 'https://www.linkedin.com/posts/sanjeev-b-ba4078247_ai-agenticai-openai-activity-7295082083398336513-4Lyl'}, {'type': 'relevant profile link', 'url': 'https://www.linkedin.com/posts/graphnick_how-easy-is-it-to-scrape-a-webpage-with-ai-activity-7296188959259340800-pfLE'}, {'type': 'relevant profile link', 'url': 'https://www.linkedin.com/in/gregtoroosian?trk=public_profile_relatedPosts_face-pile-cta'}, {'type': 'relevant profile link', 'url': 'https://www.linkedin.com/posts/gregtoroosian_machineminds-techpodcast-ai-activity-7231423440698662912-cRZ2?trk=public_profile_relatedPosts'}, {'type': 'relevant profile link', 'url': 'https://www.linkedin.com/in/jonathon-j


# Profile Overview: Ed Donner

## Background
Ed Donner is a seasoned technology leader and entrepreneur based in New York City. He has a strong academic foundation with education from the University of Oxford. His career has been marked by a focus on leveraging artificial intelligence (AI) to innovate within the travel industry, specifically as a co-founder of Simplified.Travel.

## Expertise
- **Technology Leadership**: Extensive experience in leading technological initiatives from conception through execution, particularly in the application of AI in travel.
- **Entrepreneurship**: Proven track record of building and scaling startups, with a focus on creating user-centric technologies.
- **AI Applications**: Deep understanding of how AI can transform business processes, particularly in streamlining travel experiences through personalization and automation.

## Core Values
Ed emphasizes values such as innovation, customer-centricity, and adaptability in the face of rapidly changing technologies. He appears to advocate for the responsible deployment of AI in ways that genuinely benefit end-users.

## Leadership Style
Ed is likely to adopt a collaborative leadership style, fostering open communication and fostering a team-oriented environment. His background suggests he values input from diverse perspectives to drive creative solutions and encourage team innovation.

## Relevant Accomplishments
- **Simplified.Travel**: Co-founder of a startup aimed at enhancing travel experiences through innovative tech solutions.
- **AI Innovations**: Actively involved in applying AI technologies to enhance operational efficiencies and customer experiences in the travel sector.

## Insights for Connection
- **Day-to-Day Responsibilities**: Involvement in strategic planning, development cycles of new technologies, and team management at Simplified.Travel.
- **Professional Development**: Ed stays updated on AI trends and their business applications, participating actively in discussions and publications within his network.
- **Notable Achievements**: While no specific metrics are detailed in the available information, his impact in the tech domain, especially with Simplified.Travel, indicates significant contributions to both the industry and user engagement.

## Networking Opportunities
Individuals looking to connect with Ed should consider:
- Engaging in discussions about AI and its practical implications within travel or technology.
- Sharing insights or innovations that relate to customer experience enhancement through technology.
- Asking about his experiences and insights on building a tech startup in the competitive landscape of AI and travel.

By connecting with Ed, peers and aspiring entrepreneurs can gain valuable insights from his journey and expertise in harnessing technology for transformational change.
