# Learn to Use Jina AI's Reader API for Web Scraping and Search Grounding
---

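Web scraping is the process of extracting data from websites. This notebook uses the Jina AI Reader API to turn a web page into clean text and then passes that text to an LLM.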


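### Use cases
- Extracting clean content from web pages
- Reducing unwanted tokens (navigation, scripts, markup) before the text reaches an LLM
- Giving LLMs the structured, clean input they benefit from
- Ease of use: a single HTTP request, with no scraping library to configure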
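
To make the "reducing unwanted tokens" point concrete, here is a small local illustration. It is not how the Reader API works internally; it only strips tags with Python's standard `html.parser` to show how much of a page can be markup rather than content. The `TextExtractor` class, the `visible_text` helper, and the sample HTML are all made up for this sketch.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text a reader would see, joined with single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical page: most of it is styling, tracking, and navigation.
raw_html = (
    "<html><head><style>body{margin:0}</style>"
    "<script>trackVisit();</script></head>"
    "<body><nav><a href='/'>Home</a></nav>"
    "<p>Tesla designs electric vehicles.</p></body></html>"
)
clean = visible_text(raw_html)
print(len(raw_html), "characters of HTML ->", len(clean), "characters of text")
```

Even this crude approach leaves the navigation label "Home" in the output, which is the kind of residue a dedicated content extractor aims to remove.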


In [None]:
! pip install requests google-generativeai



In [None]:
import requests

## Getting an API Key for the Jina AI Reader API

In [None]:
api_key = "YOUR_JINA_API_KEY"  # Replace with your API key; never commit a real key to a shared notebook

In [None]:
# Define the Jina Reader API endpoint
api_url = "https://r.jina.ai"

url_to_scrape = "https://en.wikipedia.org/wiki/Tesla,_Inc."  # Replace with the URL you want to scrape

In [None]:
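# Make a request to the Jina Reader API: the target URL is appended to the endpoint
headers = {"Authorization": f"Bearer {api_key}"}
response = requests.get(f"{api_url}/{url_to_scrape}", headers=headers, timeout=60)

if response.status_code == 200:
    content = response.text
    print("Extracted content:", content)
else:
    # Include the status code so failures (bad key, rate limit, bad URL) are easier to diagnose
    print(f"Failed to scrape the content (HTTP {response.status_code})")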

Extracted content: Title: Tesla, Inc.

URL Source: https://en.wikipedia.org/wiki/Tesla,_Inc.

Published Time: 2006-06-12T17:34:14Z

Markdown Content:
"Tesla Motors" redirects here. For electric motors invented by Nikola Tesla, see [Induction motor](https://en.wikipedia.org/wiki/Induction_motor "Induction motor") and [AC motor](https://en.wikipedia.org/wiki/AC_motor "AC motor").

Tesla, Inc.

[![](https://upload.wikimedia.org/wikipedia/commons/thumb/b/bd/Tesla_Motors.svg/130px-Tesla_Motors.svg.png)](https://en.wikipedia.org/wiki/File:Tesla_Motors.svg)

[![](https://upload.wikimedia.org/wikipedia/commons/thumb/b/b6/Gigafactory_Texas_Building_1_June_2022.jpg/250px-Gigafactory_Texas_Building_1_June_2022.jpg)](https://en.wikipedia.org/wiki/File:Gigafactory_Texas_Building_1_June_2022.jpg)

[Gigafactory Texas](https://en.wikipedia.org/wiki/Gigafactory_Texas "Gigafactory Texas"), Tesla's [headquarters](https://en.wikipedia.org/wiki/Headquarters "Headquarters"), in [Austin, Texas](https://en.wi
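
The response above is plain text with a few header lines (`Title:`, `URL Source:`, `Published Time:`) followed by `Markdown Content:`. If you want those fields separately, a small parser along these lines may help. It is a sketch that assumes the layout seen in this one response; `parse_reader_response` is a hypothetical helper, not part of any Jina library.

```python
def parse_reader_response(text: str) -> dict:
    """Split a Reader response into header fields and the Markdown body.

    Assumes the "Key: value" header lines come before a
    "Markdown Content:" marker, as in the output shown above.
    """
    marker = "Markdown Content:"
    header, found, body = text.partition(marker)
    if not found:
        # Unknown layout: treat everything as the body.
        return {"Markdown Content": text.strip()}

    fields = {}
    for line in header.splitlines():
        # Split on the first colon only, so URLs and timestamps stay intact.
        key, colon, value = line.partition(":")
        if colon and value.strip():
            fields[key.strip()] = value.strip()
    fields["Markdown Content"] = body.strip()
    return fields


sample = (
    "Title: Tesla, Inc.\n\n"
    "URL Source: https://en.wikipedia.org/wiki/Tesla,_Inc.\n\n"
    "Published Time: 2006-06-12T17:34:14Z\n\n"
    "Markdown Content:\n"
    "Tesla, Inc.\n"
)
parsed = parse_reader_response(sample)
print(parsed["Title"], "|", parsed["URL Source"])
```

In the notebook you would call `parse_reader_response(content)` right after the request succeeds.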

## Getting an API Key for Google Gemini

In [None]:
Gemini_api_key = "YOUR_GEMINI_API_KEY"  # Replace with your API key; never commit a real key to a shared notebook

In [None]:
import google.generativeai as genai

genai.configure(api_key=Gemini_api_key)

model = genai.GenerativeModel(model_name="gemini-1.5-flash")


In [None]:
response = model.generate_content("What are the things that happened with tesla in 2024?")
print(response.text)

I do not have access to real-time information, including events that have happened in the past. This means I can't tell you what happened with Tesla in 2024. 

To find information about Tesla in 2024, you can:

* **Check reputable news sources:** Look at financial news websites, tech blogs, and major news outlets for articles about Tesla's performance in 2024.
* **Visit Tesla's official website:** The Investor Relations section of Tesla's website will often have press releases and financial reports.
* **Search for specific events:** If you're interested in a particular event, like a new product launch or a change in leadership, you can search for it online.

Remember to be wary of unreliable sources and verify information from multiple sources. 



In [None]:
question = "What are the things that happened in 2024?"

## Retrieval-Augmented Generation (RAG)

In [None]:
response = model.generate_content(f"Use the provided content:{content} to answer the following questions. {question}")
print(response.text)

Here are the key events that happened in 2024 according to the provided text:

* **Tesla moved its incorporation from Delaware to Texas.** (June 2024)
* **Musk stated the Roadster and the Tesla next-generation vehicle should enter production in 2025.** (July 2024)
* **Tesla abandoned its plan for next-generation gigacasting.** (May 2024)
* **Tesla secured a deal with Tata Electronics to supply semiconductor chips.** (April 2024)
* **Tesla laid off 10% of its employees.** (April 2024)
* **Almost all major North America EV manufacturers announced plans to switch to Tesla's NACS adapters on their EVs by 2025.** (May 2023 to February 2024)

The provided text doesn't give details about the reasons behind these events, so you might need to look for more recent sources to get a deeper understanding of the context. 
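
The prompt above pastes the whole page into a single f-string. A slightly more defensive variant is sketched below: it delimits the context, asks the model to stay within it, and caps the context length. `build_grounded_prompt` and the `max_chars` default are assumptions chosen for illustration, not values taken from the Gemini documentation.

```python
def build_grounded_prompt(context: str, question: str, max_chars: int = 20000) -> str:
    """Assemble a RAG prompt with the context clearly delimited and capped in size.

    max_chars is an arbitrary illustrative budget; tune it to your model's
    context window.
    """
    trimmed = context[:max_chars]
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"<context>\n{trimmed}\n</context>\n\n"
        f"Question: {question}"
    )


demo_prompt = build_grounded_prompt(
    context="Tesla moved its incorporation from Delaware to Texas in June 2024.",
    question="What happened in 2024?",
)
print(demo_prompt)
```

In the notebook, this would be used as `model.generate_content(build_grounded_prompt(content, question))`.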



### Search Grounding
LLMs have a knowledge cut-off, so they cannot draw on events or facts that appeared after their training data was collected. This leads to outdated responses, hallucinations, and other factuality issues. Grounding the model in fresh search results is therefore essential for GenAI applications that answer questions about current events.

In [None]:
# Define the Jina search endpoint (s.jina.ai), which returns search results for a query
api_url_grounding = "https://s.jina.ai"

question = "who will most likely win US Election 2024?"  # Replace with your question

In [None]:
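# Make a request to the Jina search endpoint
from urllib.parse import quote

headers = {"Authorization": f"Bearer {api_key}"}
# URL-encode the question so spaces and "?" are sent as part of the path, not as a query string
response = requests.get(f"{api_url_grounding}/{quote(question, safe='')}", headers=headers, timeout=60)

if response.status_code == 200:
    content = response.text
    print("Extracted content:", content)
else:
    print(f"Search request failed (HTTP {response.status_code})")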

Extracted content: [1] Title: Who Is Favored To Win The 2024 Presidential Election? | FiveThirtyEight
[1] URL Source: https://projects.fivethirtyeight.com/2024-election-forecast/
[1] Description: 538’s 2024 presidential <strong>election</strong> forecast model showing Democrat Kamala Harris’s and Republican Donald Trump’s chances of <strong>winning</strong>.


[2] Title: Tracking the Swing States for Harris and Trump - The New York Times
[2] URL Source: https://www.nytimes.com/interactive/2024/us/elections/presidential-election-swing-states.html
[2] Description: The presidential race <strong>will</strong> <strong>most</strong> <strong>likely</strong> come down to voters in eight states that remain competitive.


[3] Title: 2024 Presidential Election Predictions | The Hill and DDHQ
[3] URL Source: https://elections2024.thehill.com/forecast/2024/president/
[3] Description: Decision Desk HQ and The Hill’s ultimate hub for polls, predictions, and election results. ... Our model currently p
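
The search output uses a numbered `[n] Key: value` layout. If you want to cite sources or filter results before prompting the model, it can help to turn that text into a list of records. The parser below is a sketch based only on the layout visible above; `parse_search_results` is a hypothetical helper, and the sample data is invented.

```python
import re

# Matches lines such as "[1] Title: Some page title".
LINE_PATTERN = re.compile(r"^\[(\d+)\]\s+([^:]+):\s*(.*)$")


def parse_search_results(text: str) -> list:
    """Group "[n] Key: value" lines into one dict per result number."""
    results = {}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if not match:
            continue  # blank lines and unnumbered text are ignored
        index = int(match.group(1))
        key = match.group(2).strip()
        # Drop the <strong> highlighting seen in the descriptions.
        value = re.sub(r"</?strong>", "", match.group(3)).strip()
        results.setdefault(index, {"index": index})[key] = value
    return [results[i] for i in sorted(results)]


sample_results = (
    "[1] Title: Example A\n"
    "[1] URL Source: https://example.com/a\n"
    "[1] Description: An <strong>example</strong> page.\n"
    "\n"
    "[2] Title: Example B\n"
    "[2] URL Source: https://example.com/b\n"
)
for result in parse_search_results(sample_results):
    print(result["index"], result["Title"], result["URL Source"])
```

Calling `parse_search_results(content)` on the real response would give one dict per hit, keyed by the field names that appear in the output.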

In [None]:
response = model.generate_content(f"Use the provided content:{content} to answer the following questions. {question}")
print(response.text)

Based on the provided information, the 2024 US presidential election is incredibly close, but the content leans slightly towards Kamala Harris winning. 

Here's a breakdown of the information:

* **Polls:**  While the polls show Harris slightly ahead of Trump, they are very close, and the race is described as "exceptionally close." 
* **Swing States:**  The outcome likely hinges on the crucial swing states, which are currently competitive. This means either candidate could win them.
* **Electoral College Forecasts:**  While specific forecasts are not mentioned, it is noted that analysts and pundits are issuing probabilistic forecasts for the Electoral College, using various factors to estimate the likelihood of each candidate winning.

**Important Considerations:**

* **The content was published in 2015:**  The Wikipedia article, while updated with some information about the 2024 election, was originally published in 2015.  A lot can change in the political landscape in almost a decade

## Wrap-Up
The Jina AI Reader API integrates easily into LLM workflows. This notebook covered:
- Web scraping
- Gemini integration
- Search grounding
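
As a closing sketch, the search and generation steps can be combined into one function. The version below takes the two steps as plain callables, so it can be tried without network access or API keys; `answer_with_grounding` and the two stub functions are hypothetical names introduced here.

```python
from typing import Callable


def answer_with_grounding(
    question: str,
    search: Callable[[str], str],
    generate: Callable[[str], str],
) -> str:
    """Fetch context for the question, then ask the model to answer from it."""
    context = search(question)
    # Same prompt wording as the cells above.
    prompt = (
        f"Use the provided content:{context} "
        f"to answer the following questions. {question}"
    )
    return generate(prompt)


# Stand-ins so the sketch runs offline; swap in real calls in the notebook.
def fake_search(query: str) -> str:
    return f"[1] Title: Stub result for {query!r}"


def fake_generate(prompt: str) -> str:
    return f"(model answer based on {len(prompt)} prompt characters)"


print(answer_with_grounding("What changed this week?", fake_search, fake_generate))
```

In the notebook, `search` would wrap the `requests.get` call to `https://s.jina.ai` and return `response.text`, and `generate` would wrap `model.generate_content(prompt).text`.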