## REST API Example

First, we will use the requests library to extract data from a public REST API. For this example, let's use the JSONPlaceholder API, a simple fake REST API for testing and prototyping.

In [None]:
import requests
import json

response = requests.get('https://jsonplaceholder.typicode.com/posts')
posts = response.json()

# Print the first post
print(json.dumps(posts[0], indent=4))


{
    "userId": 1,
    "id": 1,
    "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit",
    "body": "quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"
}


This code sends a GET request to the API and uses response.json() to parse the JSON response body into a list of dictionaries. The json.dumps() function then prints the first post with four-space indentation.

For more complex APIs, you may need to handle things like pagination, API keys, or query parameters. For example, let's extract data from a REST API with pagination:

In [None]:
import requests

# A list to store all the posts
all_posts = []

# The base URL of the API
base_url = 'https://jsonplaceholder.typicode.com/posts'

# The maximum number of posts per page
max_posts_per_page = 10

# The number of pages to fetch
num_pages = 5

# For each page
for i in range(1, num_pages + 1):
    # The URL of the page
    url = f'{base_url}?_page={i}&_limit={max_posts_per_page}'

    # Send a GET request to the API
    response = requests.get(url)

    # Parse the response as JSON
    posts = response.json()

    # Add the posts to the list
    all_posts.extend(posts)

# Print the number of posts
print(f'Fetched {len(all_posts)} posts')


Fetched 50 posts
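
The loop above assumes the number of pages is known in advance. When it is not, one common pattern is to keep requesting pages until one comes back empty. The following is a minimal sketch under that assumption; fetch_all, fetch_page, and the fake data are hypothetical names introduced for illustration, and fake_fetch_page stands in for a real call such as requests.get(url).json() so the example runs without network access.

```python
from typing import Callable, List


def fetch_all(fetch_page: Callable[[int], List[dict]], max_pages: int = 100) -> List[dict]:
    """Collect items page by page until an empty page is returned."""
    items = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            # An empty page is treated as the end of the data (assumption)
            break
        items.extend(batch)
    return items


# Hypothetical stand-in for an API that holds 25 items, 10 per page
FAKE_DATA = [{'id': i} for i in range(1, 26)]


def fake_fetch_page(page: int, limit: int = 10) -> List[dict]:
    start = (page - 1) * limit
    return FAKE_DATA[start:start + limit]


all_items = fetch_all(fake_fetch_page)
print(f'Fetched {len(all_items)} items')
```

With a real API, fetch_page would wrap the requests.get() call from the previous cell. The max_pages cap guards against an endpoint that never returns an empty page.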


## Database Example

In [None]:
import sqlite3

# Connect to the database
conn = sqlite3.connect('example.db')

# Create a cursor
c = conn.cursor()

# Execute a query
c.execute('SELECT * FROM Posts')

# Fetch all rows from the query
rows = c.fetchall()

# Print the first row
print(rows[0])

# Close the connection
conn.close()


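This cell raises sqlite3.OperationalError when example.db does not contain a Posts table. Note that sqlite3.connect() creates an empty database file if none exists, so the connection itself succeeds and only the query fails.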
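
As a self-contained illustration of the same query, the sketch below creates a Posts table in an in-memory SQLite database, inserts two rows, and reads them back. The column names (id, title) and the sample rows are assumptions made for this example, not part of any existing database.

```python
import sqlite3

# ':memory:' creates a temporary database that lives only in RAM
conn = sqlite3.connect(':memory:')
try:
    c = conn.cursor()

    # Create the table the query expects (hypothetical schema)
    c.execute('CREATE TABLE Posts (id INTEGER PRIMARY KEY, title TEXT)')

    # Insert sample rows using ? placeholders instead of string formatting
    sample_posts = [(1, 'first post'), (2, 'second post')]
    c.executemany('INSERT INTO Posts (id, title) VALUES (?, ?)', sample_posts)
    conn.commit()

    # Run the same kind of query as the cell above
    c.execute('SELECT * FROM Posts ORDER BY id')
    rows = c.fetchall()
    print(rows[0])
finally:
    # Close the connection even if a statement fails
    conn.close()
```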

Python can interact with a wide variety of databases through different libraries. Here's an example of how to extract data from a PostgreSQL database using psycopg2:

In [3]:
import psycopg2

# Connect to the database
conn = psycopg2.connect(
    dbname='your_database_name',
    user='your_username',
    password='your_password',
    host='localhost'
)

# Create a cursor
cur = conn.cursor()

# Execute a query
cur.execute('SELECT * FROM your_table_name')

# Fetch all rows from the last command
rows = cur.fetchall()

# Print the first row
print(rows[0])

# Close the connection
conn.close()


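ModuleNotFoundError: No module named 'psycopg2'

This error means psycopg2 is not installed in the current environment; install it with pip before running the cell.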

Don't forget to replace 'your_database_name', 'your_username', 'your_password', and 'your_table_name' with your actual database name, username, password, and table name.
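
One way to avoid hard-coding these values in the notebook is to read them from environment variables. The following is a minimal sketch, not a required setup: connection_settings is a hypothetical helper, and the variable names PGDATABASE, PGUSER, PGPASSWORD, and PGHOST follow the convention used by PostgreSQL client tools, which is an assumption about your environment.

```python
import os


def connection_settings(env=os.environ):
    """Build connection keyword arguments from environment variables."""
    return {
        'dbname': env.get('PGDATABASE', 'your_database_name'),
        'user': env.get('PGUSER', 'your_username'),
        'password': env.get('PGPASSWORD', ''),
        'host': env.get('PGHOST', 'localhost'),
    }


# A plain dict stands in for os.environ so the example is reproducible
settings = connection_settings({
    'PGDATABASE': 'shop',
    'PGUSER': 'analyst',
    'PGPASSWORD': 'secret',
})
print(settings['dbname'], settings['host'])
```

The resulting dictionary can be passed to the connection call as psycopg2.connect(**settings).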

## File Example

Python can extract data from various types of files. Here's an example of extracting data from a CSV file using the pandas library:

In [2]:
import pandas as pd

# Read the CSV file
df = pd.read_csv('data.csv')

# Print the first 5 rows
print(df.head())

   order_id  product_id  customer_id  quantity_ordered  price_each
0         1           1           35                 4        40.0
1         2           7           33                 2        20.0
2         3          19           11                 2        10.0
3         4           4           48                 1        80.0
4         5           3           18                 2        40.0


In this code, the pandas read_csv() function is used to read the CSV file 'data.csv' into a DataFrame, which is a two-dimensional table-like data structure that can be manipulated in many ways.
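
For small files, or when pandas is not available, the standard library's csv module can read the same kind of data. This is a sketch of that alternative rather than what the cell above does: it reads the first two rows shown in the output from an in-memory string instead of data.csv, and the order_total field is a hypothetical addition for illustration.

```python
import csv
import io

# The first two rows from the output above, as CSV text
csv_text = """order_id,product_id,customer_id,quantity_ordered,price_each
1,1,35,4,40.0
2,7,33,2,20.0
"""

# DictReader yields one dict per row, keyed by the header names
reader = csv.DictReader(io.StringIO(csv_text))

orders = []
for row in reader:
    # csv returns every value as a string, so convert explicitly
    quantity = int(row['quantity_ordered'])
    price = float(row['price_each'])
    orders.append({'order_id': int(row['order_id']), 'order_total': quantity * price})

print(orders)
```

Unlike read_csv(), the csv module does not infer column types, which is why the conversions are written out by hand.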

pandas can also read Excel files, JSON files, and more. Here's an example of how to extract data from an Excel file (reading .xlsx files requires the openpyxl package):

In [None]:
import pandas as pd

# Read the Excel file
df = pd.read_excel('data.xlsx')

# Print the first 5 rows
print(df.head())
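
JSON files, mentioned above, can be read with the standard library's json module. The following is a minimal sketch: it first writes a small sample file to a temporary directory so the example is self-contained, and the file name posts.json and its contents are assumptions for illustration.

```python
import json
import os
import tempfile

# Sample records to write out (hypothetical data)
sample = [{'id': 1, 'title': 'first post'}, {'id': 2, 'title': 'second post'}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'posts.json')

    # Write the sample file
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(sample, f)

    # Read it back into a list of dictionaries
    with open(path, encoding='utf-8') as f:
        records = json.load(f)

# Print the title of the first record
print(records[0]['title'])
```

For tabular JSON, pandas also provides pd.read_json(), which returns a DataFrame.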


You can install the necessary libraries using pip:

In [None]:
#!pip install requests pandas psycopg2 openpyxl

### Extracting Data from APIs

1. **Requests**: This is a simple yet powerful library for making HTTP requests. It can be used to send all kinds of HTTP requests (GET, POST, PUT, DELETE, etc.) and to handle the responses. It's great for working with APIs.

2. **Tweepy**: This is a Python library for accessing the Twitter API. It simplifies the process of working with Twitter's RESTful API.

3. **PyGithub**: This is a Python library to access the GitHub REST API. This library allows you to manage your GitHub resources such as repositories, user profiles, and organizations.

### Extracting Data from Databases

1. **Psycopg2**: This is a PostgreSQL adapter for Python. It is used to connect Python with PostgreSQL.

2. **sqlite3**: This module is part of the standard Python library, and it allows you to work with SQLite databases.

3. **PyMySQL**: This library allows Python to connect with a MySQL database.

4. **SQLAlchemy**: This is a SQL toolkit and ORM (Object-Relational Mapper) that gives application developers the full power and flexibility of SQL.

### Web Scraping Libraries

1. **Beautiful Soup**: This library is used for parsing HTML and XML documents, which is often useful for web scraping.

2. **Scrapy**: This is an open-source web crawling framework that allows you to write spiders to scrape data from websites.

3. **Selenium**: This is a tool that allows you to automate browsers. While it's often used for testing web applications, it can also be used for web scraping when the website relies on JavaScript to load or display data.

4. **Requests-HTML**: This library builds on Requests and adds HTML parsing in a single package, along with the ability to render JavaScript-generated content before parsing it.

Remember to install any of these libraries using pip before using them. The exception is sqlite3, which ships with Python and cannot be installed with pip:

```bash
pip install requests tweepy PyGithub psycopg2 PyMySQL SQLAlchemy beautifulsoup4 scrapy selenium requests-html
```

This is a long command, and it will install many libraries. You may wish to only install the libraries that you actually need for your project.