# Requests

Before diving further into web-scraping and API calls, let's go over the basics of the `requests` module.

The `requests` package lets us retrieve files over the web, including `HTML`, `JSON`, and `XML` documents. To get started, install it using the code block below.

Also, check out the following documentation for additional guidance: https://requests.readthedocs.io/en/latest/user/quickstart/ 

In [1]:
!pip install requests



After installing, let's "request" a simple HTML page using the `get` function and read some attributes of the response.

In [2]:
import requests

# NOTE: You can place any url you want in the `get()` function, although there will be no guarantee that you can get this data...
r = requests.get("https://www.scrapethissite.com/pages/simple/")

# let's print out the encoding of this page, as well as the "status" code of this page
print(r.encoding)
print(r.status_code)

utf-8
200
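The status code is a standard HTTP status number, where 200 means the request succeeded. As a small illustration that sits outside the `requests` API, Python's built-in `http.HTTPStatus` can turn a code into a readable phrase. The helper name `describe_status` is hypothetical and only used for this sketch.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short human-readable label for an HTTP status code."""
    status = HTTPStatus(code)  # raises ValueError for codes it does not know
    return f"{status.value} {status.phrase}"


print(describe_status(200))  # 200 OK
print(describe_status(404))  # 404 Not Found
```

In practice you would pass `r.status_code` to a helper like this, or simply compare it against 200 before reading the body.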


Notice that we can get some basic information about this website. But what if we want the actual **content** (that is, the HTML) of the page? What do we use instead?

Quite simply, we can use the `text` attribute. Note that the right accessor depends on the kind of resource you are requesting:

* `text`: for websites (the body as a string)
* `json()`: for API calls (the body parsed into Python objects)
* `content`: for binary content such as images and BLOBs (the body as raw bytes)

In [3]:
r.text

'<!doctype html>\n<html lang="en">\n  <head>\n    <meta charset="utf-8">\n    <title>Countries of the World: A Simple Example | Scrape This Site | A public sandbox for learning web scraping</title>\n    <link rel="icon" type="image/png" href="/static/images/scraper-icon.png" />\n\n    <meta name="viewport" content="width=device-width, initial-scale=1.0">\n    <meta name="description" content="A single page that lists information about all the countries in the world. Good for those just get started with web scraping.">\n\n    <link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css" rel="stylesheet" integrity="sha256-MfvZlkHCEqatNoGiOXveE8FIwMzZg4W85qfrfIFBfYc= sha512-dTfge/zgoMYpP7QbHy4gWMEGsbsdZeCXz7irItjcC3sPUFtf0kuFbDz/ixG7ArTxmDjLXDmezHubeNikyKGVyQ==" crossorigin="anonymous">\n    <link href=\'https://fonts.googleapis.com/css?family=Lato:400,700\' rel=\'stylesheet\' type=\'text/css\'>\n    <link rel="stylesheet" type="text/css" href="/static/css/styles.css"

Much like `f.read()` from yesterday, this gives us the actual data inside of the site!
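Returning to the three accessors listed above, here is a rough offline sketch of how they relate to one another. It does not call `requests` at all: the `raw` bytes are a made-up stand-in for a response body, and the decoding step assumes UTF-8.

```python
import json

# Made-up stand-in for the raw bytes of a response body (what `content` holds)
raw = b'{"name": "pikachu", "id": 25}'

# `text` is roughly those bytes decoded into a string (UTF-8 assumed here)
text = raw.decode("utf-8")

# `json()` is roughly that string parsed into Python objects
parsed = json.loads(text)

print(type(raw).__name__)     # bytes
print(type(text).__name__)    # str
print(type(parsed).__name__)  # dict
print(parsed["name"])         # pikachu
```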

## JSON

Now that we have a handle on how to extract data from the web, let's use it to pull information from a Web API and work with JSON data.

JSON is a text format that Web APIs commonly use to return (free or paid) resources from a specific endpoint. While Web APIs are often associated with web development, they are just as useful in our roles as data scientists, analysts, and engineers.

In [8]:
r = requests.get("https://pokeapi.co/api/v2/pokemon/pikachu")

data = r.text

print(data[0])

{


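Note that `r.text` gives us the whole response as one long string, so `data[0]` is just its first character. Rather than treating data extracted from a URL as a plain string, we should parse it: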

In [18]:
r = requests.get("https://pokeapi.co/api/v2/pokemon/pikachu")

data = r.json()

print(data["abilities"])

[{'ability': {'name': 'static', 'url': 'https://pokeapi.co/api/v2/ability/9/'}, 'is_hidden': False, 'slot': 1}, {'ability': {'name': 'lightning-rod', 'url': 'https://pokeapi.co/api/v2/ability/31/'}, 'is_hidden': True, 'slot': 3}]


Here the response is loaded using the `json()` method, which parses it into Python dictionaries and lists. Afterwards, we can apply basic programming structures, such as loops, to access the data.

In [19]:
r = requests.get("https://pokeapi.co/api/v2/pokemon/pikachu")

data = r.json()

for ab in data["abilities"]:
    print(ab)

{'ability': {'name': 'static', 'url': 'https://pokeapi.co/api/v2/ability/9/'}, 'is_hidden': False, 'slot': 1}
{'ability': {'name': 'lightning-rod', 'url': 'https://pokeapi.co/api/v2/ability/31/'}, 'is_hidden': True, 'slot': 3}
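Because the parsed result is made of ordinary dictionaries and lists, nested values can be reached by chaining keys. The sketch below works offline on a copy of the `abilities` output shown above rather than on a live response.

```python
# Copy of the "abilities" value printed above, so no network call is needed
abilities = [
    {"ability": {"name": "static", "url": "https://pokeapi.co/api/v2/ability/9/"},
     "is_hidden": False, "slot": 1},
    {"ability": {"name": "lightning-rod", "url": "https://pokeapi.co/api/v2/ability/31/"},
     "is_hidden": True, "slot": 3},
]

# Chain keys to reach the nested name of each ability
names = [ab["ability"]["name"] for ab in abilities]

# Keep only the abilities that are not hidden
visible = [ab["ability"]["name"] for ab in abilities if not ab["is_hidden"]]

print(names)    # ['static', 'lightning-rod']
print(visible)  # ['static']
```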


# Stock API Exercise 

Follow the directions below to implement some basic data engineering with the `requests`, `json`, and `pandas` packages.

By the end of this analysis, we should be able to convert a JSON object into a pandas DataFrame.

We will then use these dataframes for some basic analysis.

In [5]:
import requests

# PART 1

# TODO: paste in your key here
key = "YOUR_API_KEY"

# TODO: Using the "aggregates" endpoint, build a URL that requests stock data for BA (Boeing) over the date
# range 2024-01-01 to 2024-03-11.
# Most of the URL is filled in for you. You just need to fill in the `BA` stock ticker after `ticker`,
# then `2024-01-01` and `2024-03-11` after `day`.
# Docs: https://polygon.io/docs/stocks/get_v2_aggs_ticker__stocksticker__range__multiplier___timespan___from___to

url = f"https://api.polygon.io/v2/aggs/ticker/BA/range/1/day/2024-01-01/2024-03-11?adjusted=true&sort=asc&limit=120&apiKey={key}"

r = requests.get(url)

# TODO: print out the request status of this GET request
print(r.status_code)

200
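One way to avoid typos in long request URLs is to assemble them from parts. The sketch below is one possible alternative to a single f-string, not the exercise's required solution. The helper name `build_aggs_url` and the `POLYGON_API_KEY` environment variable are hypothetical, while the path layout and query parameters are taken from the URL in the cell above.

```python
import os
from urllib.parse import urlencode

BASE = "https://api.polygon.io/v2/aggs/ticker"


def build_aggs_url(ticker: str, start: str, end: str, api_key: str) -> str:
    """Build a daily-aggregates URL for one ticker and date range."""
    path = f"{BASE}/{ticker}/range/1/day/{start}/{end}"
    query = urlencode({
        "adjusted": "true",
        "sort": "asc",
        "limit": 120,
        "apiKey": api_key,
    })
    return f"{path}?{query}"


# Hypothetical: read the key from an environment variable instead of hard-coding it
api_key = os.environ.get("POLYGON_API_KEY", "YOUR_API_KEY")
url = build_aggs_url("BA", "2024-01-01", "2024-03-11", api_key)
print(url.split("apiKey=")[0])  # show the URL without revealing the key
```

Keeping the key out of the notebook itself also makes the notebook safer to share.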

In [12]:
# PART 2

# TODO: Load this data as a JSON object
data = r.json()
# TODO: and then print out the 'results' value of this `data` JSON object.
# What do you notice about the structure of this JSON object?
# Is it just one single object, or multiple?
# 'results' is a list with one dictionary per trading day, so we look at the first day only
day = data["results"][0]
print(day["o"])
print(day["c"])

257.28
251.76


In [13]:
# PART 3

# TODO: Using a for-loop, loop through each day of data and print out the closing price as well as the opening price of
# each respective day
# NOTE: The "o" key represents "opening price", whereas the "c" key represents "closing price"

# This loop prints the "v" (volume) and "t" (timestamp in milliseconds) keys;
# swap in day["o"] and day["c"] to print the prices.
for day in data["results"]:
    print(day["v"], day["t"])

5815219.0 1704171600000
7219930.0 1704258000000
5170739.0 1704344400000
3849746.0 1704430800000
40730433.0 1704690000000
20687539.0 1704776400000
12883738.0 1704862800000
11830489.0 1704949200000
11285521.0 1705035600000
34972141.0 1705381200000
20136513.0 1705467600000
20045457.0 1705554000000
14345444.0 1705640400000
10760764.0 1705899600000
9050418.0 1705986000000
15119120.0 1706072400000
22108472.0 1706158800000
9911195.0 1706245200000
7515898.0 1706504400000
13309589.0 1706590800000
22409376.0 1706677200000
10679209.0 1706763600000
6275335.0 1706850000000
8682514.0 1707109200000
7561923.0 1707195600000
7747653.0 1707282000000
5787652.0 1707368400000
4349671.0 1707454800000
4077413.0 1707714000000
8263634.0 1707800400000
6494702.0 1707886800000
5309060.0 1707973200000
5986529.0 1708059600000
5248399.0 1708405200000
4179808.0 1708491600000
6494041.0 1708578000000
7433353.0 1708664400000
4807216.0 1708923600000
3932875.0 1709010000000
9654653.0 1709096400000
6670260.0 1709182800000
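The second column in the output above is the `t` field, which appears to be a Unix timestamp in milliseconds. Here is a minimal sketch for turning such values into calendar dates, using the first three timestamps from that output and assuming they should be read as UTC.

```python
from datetime import datetime, timezone

# First three "t" values from the output above
timestamps_ms = [1704171600000, 1704258000000, 1704344400000]

# Divide by 1000 because datetime expects seconds, not milliseconds
dates = [
    datetime.fromtimestamp(ts / 1000, tz=timezone.utc).date().isoformat()
    for ts in timestamps_ms
]

print(dates)  # ['2024-01-02', '2024-01-03', '2024-01-04']
```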

In [14]:
# PART 4
open_price = []
close_price = []

# TODO: now that we know how to access this data programmatically, let's loop through our results and save
# opening prices and closing prices into the two respective lists
for day in data["results"]:
    open_price.append(day['o'])
    close_price.append(day['c'])

# print out these lists to confirm that you've successfully saved data
print(open_price)
print(close_price)

[257.28, 248.32, 244.58, 245.04, 228, 225.66, 226.9, 228.07, 219.97, 210.07, 202.63, 205.64, 210.89, 213.07, 215.35, 209.83, 208.2, 203.08, 206.06, 203.65, 204.92, 213.84, 209.06, 204.64, 206.02, 208.76, 212.4, 209.77, 208.7, 206.9, 205.95, 204.12, 204.88, 203.55, 202.9, 202, 200.99, 201.01, 200.93, 201.14, 206.44, 204, 199.5, 199.49, 201.77, 200.7, 201.84, 194.21]


In [18]:
# PART 5

import pandas as pd

# Next, let's create a DataFrame using these two lists
data = {
    "open": open_price,
    "close": close_price
}

df = pd.DataFrame(data)

Complete the listed analysis below using this dataframe.

In [19]:
# TODO: Calculate summary statistics on opening price and closing price for Boeing stock
df.describe()
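As a sanity check on summary statistics, the same kind of figures can be computed by hand with Python's built-in `statistics` module. The sketch below uses only the first five opening prices from the list printed in Part 4, so its numbers will not match statistics computed over the full date range.

```python
import statistics

# First five opening prices from the Part 4 output
sample_open = [257.28, 248.32, 244.58, 245.04, 228]

summary = {
    "count": len(sample_open),
    "mean": statistics.mean(sample_open),
    "std": statistics.stdev(sample_open),  # sample standard deviation
    "min": min(sample_open),
    "max": max(sample_open),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The sample standard deviation is used here because pandas also uses it by default, which makes the two results comparable.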

In [None]:
# TODO: draw a line plot of all available Boeing stock prices
df.plot()

In [None]:
# TODO: Create a new column that holds the ratio of opening to closing stock price
df["open_close_ratio"] = df["open"] / df["close"]