<a href="https://colab.research.google.com/github/brendenwest/cis122/blob/main/10_data_retrieval.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fetching Data

### Reading

- https://requests.readthedocs.io/en/latest/user/quickstart/
- https://www.tutorialspoint.com/http/index.htm
- https://www.w3schools.com/python/python_json.asp

### Learning Outcomes
- What is data retrieval?
- Basics of HTTP requests
- Making HTTP requests with Python
- Querying databases with Python

### What is Data Retrieval?

Programming often involves retrieving data from a source outside of your program. Common sources are a file, a database, or an `API` (a service reached over the internet).

We previously covered loading data from a file. This doc covers how to fetch data from an internet service or a database with Python.

### HTTP Basics

HTTP defines how a client can send a request to a server and what the response should look like.

HTTP methods define specific kinds of requests. The most common are:

- GET - request data from a server
- POST - send data to a server

#### HTTP GET

A GET request consists primarily of a `URL` (web address).

The URL may contain `query parameters` (name/value pairs). The parameters follow a `?`, each name is joined to its value by `=`, and pairs are separated by `&`, as in this weather forecast example:

```
api.openweathermap.org/data/2.5/forecast/daily?lat=47.6062&lon=122.3321&cnt=5&appid=12345
```
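As a rough sketch of how such a URL is put together, the standard library's `urllib.parse` can build and take apart a query string. The parameter values are copied from the example above; the `https://` scheme is an assumption added so the URL parses cleanly, and `12345` is a placeholder rather than a real API key.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Values copied from the forecast example; the https:// scheme is assumed
# and the appid is a placeholder, not a working key.
base_url = "https://api.openweathermap.org/data/2.5/forecast/daily"
params = {"lat": 47.6062, "lon": 122.3321, "cnt": 5, "appid": "12345"}

# urlencode joins each name and value with "=" and separates pairs with "&"
query_string = urlencode(params)
url = f"{base_url}?{query_string}"
print(url)

# Going the other way: split a URL and read its query parameters back
parts = urlsplit(url)
parsed = parse_qs(parts.query)  # each value comes back as a list of strings
print(parts.netloc)             # the server name
print(parsed["cnt"])            # ['5']
```

Nothing is sent over the network here; the code only assembles and inspects the URL text.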

#### HTTP POST

An HTTP POST request carries its data in the request `body`.

Because browsers and servers commonly limit the length of a URL, POST is more often used to send large amounts of data to a server, e.g. form submissions & file uploads.
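To make the idea of a request body concrete, here is a minimal sketch that prepares a POST request without sending it. It uses the standard library's `urllib.request` so it runs without installing anything. The `example.com` endpoint and the student record are hypothetical, chosen only for illustration.

```python
import json
from urllib.request import Request

# Hypothetical endpoint and data, used only for illustration
url = "https://example.com/api/students"
payload = {"name": "jim", "major": "art", "gpa": 3.8}

# The data travels in the request body as bytes, not in the URL
body = json.dumps(payload).encode("utf-8")
request = Request(
    url,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(request.get_method())  # POST
print(request.full_url)      # the URL carries no query parameters
print(request.data)          # the encoded body

# Passing `request` to urllib.request.urlopen() would actually send it;
# that step is left out because the endpoint is made up.
```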

#### HTTP Headers

Requests & responses include `headers` that inform the receiver about the request or response.

An HTTP header consists of its case-insensitive name followed by a colon (:), then by its value.

```
content-type: application/json; charset=utf-8
```
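One way to see the name/value structure and the case-insensitive lookup is to parse the header line above with the standard library's `email` parser, which handles the same `name: value` format. This is only a sketch for inspecting header text; an HTTP library would normally do this parsing for you.

```python
from email import message_from_string

# Raw header text as it might appear in a response;
# a blank line marks the end of the headers.
raw_headers = "content-type: application/json; charset=utf-8\n\n"
headers = message_from_string(raw_headers)

# Lookups ignore the case of the header name
print(headers["Content-Type"])
print(headers["CONTENT-TYPE"])

# The value can be split into the media type and its charset parameter
print(headers.get_content_type())     # application/json
print(headers.get_content_charset())  # utf-8
```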

#### HTTP Response

After receiving an HTTP request, a server should return a well-defined response.

The response typically includes:
- **status code** - a standard 3-digit integer that tells the receiver whether the request succeeded or failed
- **headers** - additional information about the response (e.g. content size, type, & last modified)
- **body** - any data returned from the server
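Status codes are grouped by their first digit: 2xx means success, 3xx redirection, 4xx a client error, and 5xx a server error. The sketch below uses the standard library's `http.HTTPStatus` to look up the standard phrase for a code; `describe_status` is a hypothetical helper written for this example.

```python
from http import HTTPStatus


def describe_status(code):
    """Return the code, its standard phrase, and its general category."""
    status = HTTPStatus(code)  # raises ValueError for codes it does not know
    if 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirection"
    elif 400 <= code < 500:
        category = "client error"
    elif code >= 500:
        category = "server error"
    else:
        category = "informational"
    return f"{code} {status.phrase} ({category})"


for code in (200, 404, 500):
    print(describe_status(code))
```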

#### Content Types

HTTP servers return data in a defined format that clients should be able to understand.

Some common formats for sharing data between applications are:

- CSV - Comma-separated values

```
name,major,gpa
jim,art,3.8
sue,science,3.75
```

- JSON - JavaScript Object Notation

```
[
  {"name":"jim", "major":"art", "gpa": 3.8},
  {"name":"sue", "major":"science", "gpa": 3.75}
]
```

- XML - Extensible Markup Language

```
<?xml version="1.0" encoding="UTF-8" ?>
<root>
  <row>
    <name>jim</name>
    <major>art</major>
    <gpa>3.8</gpa>
  </row>
  <row>
    <name>sue</name>
    <major>science</major>
    <gpa>3.75</gpa>
  </row>
</root>
```
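All three formats above describe the same two student records. As a minimal sketch, each one can be read into a list of Python dictionaries with the standard library. The text is embedded as strings here instead of being fetched from a server, and the XML declaration line is left out for brevity.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

csv_text = "name,major,gpa\njim,art,3.8\nsue,science,3.75\n"
json_text = """[
  {"name": "jim", "major": "art", "gpa": 3.8},
  {"name": "sue", "major": "science", "gpa": 3.75}
]"""
xml_text = """<root>
  <row><name>jim</name><major>art</major><gpa>3.8</gpa></row>
  <row><name>sue</name><major>science</major><gpa>3.75</gpa></row>
</root>"""

# CSV: every field is read as a string, so convert gpa explicitly
csv_rows = [
    {**row, "gpa": float(row["gpa"])}
    for row in csv.DictReader(io.StringIO(csv_text))
]

# JSON: numbers are converted to Python numbers automatically
json_rows = json.loads(json_text)

# XML: element text is always a string, so convert gpa here too
xml_rows = [
    {
        "name": row.findtext("name"),
        "major": row.findtext("major"),
        "gpa": float(row.findtext("gpa")),
    }
    for row in ET.fromstring(xml_text).findall("row")
]

print(csv_rows)
print(csv_rows == json_rows == xml_rows)  # True
```

JSON is the only one of the three that keeps the difference between text and numbers, which is one reason it is a common choice for APIs.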

### HTTP Requests with Python