# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and with Ollama, build a tool that takes a technical question
and responds with an explanation. This is a tool that you will be able to use yourself during the course!

In [1]:
# imports
import os
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display  # used to render model output below

In [2]:
load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")

if not api_key:
    raise ValueError("OPENAI_API_KEY not found")

client = OpenAI(api_key=api_key)

In [3]:

MODEL_GPT = 'gpt-4o-mini'

In [4]:
system_prompt = """
You are a serious assistant that analyzes code.
Explain clearly what each section does.
Respond in markdown.
"""

user_prompt_prefix = """
Here is the code. Summarize what it does:

"""

In [21]:
def summarize(code_text):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + code_text}
    ]

    response = client.chat.completions.create(
        model=MODEL_GPT,
        messages=messages,
        temperature=0
    )

    return response.choices[0].message.content


In [22]:
code_example = """
def fetch_website_links(url):
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, "html.parser")
    links = [link.get("href") for link in soup.find_all("a")]
    return [link for link in links if link]
"""

In [23]:
# display() renders the Markdown itself and returns None, so it is not wrapped in print()
display(Markdown(summarize(code_example)))

Certainly! Here’s a breakdown of the provided code:

```python
def fetch_website_links(url):
```
- This line defines a function named `fetch_website_links` that takes a single parameter, `url`. This parameter is expected to be a string representing the URL of a website.

```python
    response = requests.get(url, headers=headers)
```
- This line sends an HTTP GET request to the specified `url` using the `requests` library. The `headers` variable (which should be defined elsewhere in the code) is included in the request to provide additional information to the server (like user-agent, etc.). The server's response is stored in the variable `response`.

```python
    soup = BeautifulSoup(response.content, "html.parser")
```
- Here, the content of the `response` (which is the HTML of the webpage) is parsed using the `BeautifulSoup` library. The `soup` variable now contains a BeautifulSoup object that allows for easy navigation and searching of the HTML structure.

```python
    links = [link.get("href") for link in soup.find_all("a")]
```
- This line creates a list comprehension that iterates over all `<a>` (anchor) tags found in the `soup` object. For each anchor tag, it retrieves the value of the `href` attribute (which typically contains the URL the link points to). The resulting list of URLs is stored in the variable `links`.

```python
    return [link for link in links if link]
```
- Finally, this line returns a new list that includes only the non-empty links from the `links` list. It filters out any `None` or empty string values, ensuring that the returned list contains only valid URLs.

### Summary
The `fetch_website_links` function takes a URL as input, retrieves the HTML content of the webpage at that URL, extracts all the hyperlinks (URLs) from the anchor tags, and returns a list of valid (non-empty) URLs.


In [None]:
# Quick check that the local Ollama server is reachable before using it
import requests
requests.get("http://localhost:11434").content
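
The check above raises an exception if nothing is listening on the port. A minimal sketch of a gentler variant follows, assuming the default local address used in this notebook; the helper name `is_ollama_running` and its `timeout` parameter are illustrative and not part of Ollama or the course code. It uses only the standard library and returns `False` instead of raising when the server cannot be reached.

```python
import urllib.request


def is_ollama_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers with status 200 at base_url.

    Hypothetical helper: the name and defaults are assumptions for this sketch.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses,
        # so any connection problem is reported as "not running".
        return False
```

Calling `is_ollama_running()` before creating the Ollama client gives a clear yes or no answer, which makes it easier to tell a stopped server apart from a problem with the model name.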

In [9]:

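# Ollama exposes an OpenAI-compatible endpoint under /v1
OLLAMA_BASE_URL = "http://localhost:11434/v1"

# The OpenAI client requires an api_key value; Ollama does not check it
ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')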

In [15]:
def summarize(code_text):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + code_text}
    ]

    response = ollama.chat.completions.create(
        model="deepseek-r1:1.5b",
        messages=messages,
        temperature=0
    )

    return response.choices[0].message.content

In [18]:
from IPython.display import Markdown, display

# display() renders the Markdown itself and returns None, so it is not wrapped in print()
display(Markdown(summarize(code_example)))

The provided code extracts website links from a given URL using HTML parsing. Here's a step-by-step explanation:

1. **Requesting the URL**: The function makes a GET request to the provided URL with headers set for authentication.

2. **Parsing HTML Content**: Using BeautifulSoup, it parses the HTML content of the response page.

3. **Extracting Links**: It extracts all "a" tags from the parsed HTML, which represent links on the website.

4. **Filtering Valid Links**: The function filters out any empty or invalid URLs from the list of extracted links.

5. **Returning the Result**: Finally, it returns a list containing only the valid website links.

This function is typically used to retrieve actual websites that are served by an API endpoint, allowing for further processing such as data extraction or analysis.


In [19]:
def summarize(code_text):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + code_text}
    ]

    response = ollama.chat.completions.create(
        model="llama3.2",
        messages=messages,
        temperature=0
    )

    return response.choices[0].message.content

In [20]:
# display() renders the Markdown itself and returns None, so it is not wrapped in print()
display(Markdown(summarize(code_example)))

**Function Summary**
=====================

The `fetch_website_links` function takes a URL as input and returns a list of extracted website links.

**Code Breakdown**
-----------------

### 1. `requests.get(url, headers=headers)`

*   This line sends a GET request to the specified URL using the `requests` library.
*   The `headers` parameter is used to pass additional HTTP headers with the request.

### 2. `soup = BeautifulSoup(response.content, "html.parser")`

*   After receiving the response from the server, this line parses the HTML content of the webpage using the `BeautifulSoup` library.
*   The `"html.parser"` argument specifies the parser to use for parsing the HTML.

### 3. `links = [link.get("href") for link in soup.find_all("a")]`

*   This list comprehension finds all `<a>` tags (hyperlinks) within the parsed HTML and extracts their `href` attributes.
*   The extracted links are stored in a list called `links`.

### 4. `return [link for link in links if link]`

*   Finally, this line filters out any empty strings from the `links` list using a conditional expression.
*   The resulting filtered list of links is returned by the function.

**Example Use Case**
--------------------

```python
import requests
from bs4 import BeautifulSoup

# Set up headers for the request
headers = {"User-Agent": "My Web Scraper"}

# Define the URL to fetch links from
url = "https://www.example.com"

# Call the fetch_website_links function
links = fetch_website_links(url)

# Print the extracted links
for link in links:
    print(link)
```

This code snippet demonstrates how to use the `fetch_website_links` function to extract links from a given URL.


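The notebook defines `summarize` three times, once per backend, and each new definition replaces the previous one. As a possible consolidation, the sketch below takes the client and the model name as arguments, so the same function can serve the OpenAI client and the Ollama client. The names `build_messages` and `summarize_with` are hypothetical and were chosen for this example; the request itself is the same `chat.completions.create` call used in the cells above.

```python
def build_messages(code_text, system_prompt, user_prompt_prefix):
    """Assemble the chat messages in the same shape the cells above use."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_prefix + code_text},
    ]


def summarize_with(client, model, code_text, system_prompt, user_prompt_prefix):
    """Send code_text to `model` through any OpenAI-compatible client.

    Hypothetical helper: `client` is expected to expose
    chat.completions.create, as the OpenAI client does.
    """
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(code_text, system_prompt, user_prompt_prefix),
        temperature=0,
    )
    return response.choices[0].message.content


# Possible usage with the objects defined earlier in this notebook:
# summarize_with(client, MODEL_GPT, code_example, system_prompt, user_prompt_prefix)
# summarize_with(ollama, "llama3.2", code_example, system_prompt, user_prompt_prefix)
```

Passing the client explicitly also makes the function easy to exercise with a stand-in object, without calling a real model.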