# Module 2: Make API calls
- Collect data from an API & store it in a CSV file
- Scrape static websites & store the data in a CSV file
---
Scenario
---

1. You have been given an API with API documentation.
2. Make an API call from Python & store the response in a Python dict.
   - (2A) *Optional*: Store the response dict in a JSON file.

3. Parse the following info from the API Response.

Example: URL 

https://min-api.cryptocompare.com/data/price?fsym=USD&tsyms=JPY,INR
```
{
"JPY": 107.93,
"INR": 84.82
}
```
4. Store the parsed information in a CSV file.

| sno | from_symbol | to_symbol | price | datetime |
| ---  | ---  | ---  | ---  | --- |
| 1 | USD | INR | 71 | ... |
| 2 | USD | SGD | 1.37| ... |

Use Python's `time` or `datetime` module to get the current time when you make the API call, and store it in the CSV file.
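A minimal sketch of capturing the request time with the standard `datetime` module. The variable names are illustrative, and taking the timestamp in UTC is an assumption made here; the solution cells further down use local time via `datetime.datetime.now()`.

```python
import datetime

# Capture the time right around the API call.
# Using an explicit UTC timezone is an assumption; datetime.datetime.now() gives local time.
requested_at = datetime.datetime.now(datetime.timezone.utc)

# isoformat() yields a sortable string that is convenient to store in a CSV column.
stamp = requested_at.isoformat(timespec="seconds")
print(stamp)
```

An ISO 8601 string sorts chronologically and can be parsed back with `datetime.datetime.fromisoformat`.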


## Part 2: Scrape static websites


https://www.xe.com/currencyconverter/convert/?Amount=1&From=USD&To=SGD


![https://i.imgur.com/C4Eub9d.png](https://i.imgur.com/C4Eub9d.png)



Have a look at the HTML as well to work out how to parse the exchange rate.

![https://i.imgur.com/vDAntcv.png](https://i.imgur.com/vDAntcv.png)

1. Look at the query parameters and decide how to pass inputs.
2. Use the `bs4` library to parse the HTML shown above.
3. Extract the price shown there and save it into a CSV file.

| sno | from_symbol | to_symbol | price | datetime |
| ---  | ---  | ---  | ---  | --- |
| 1 | USD | SGD | 1.37| ... |
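One way to pass the inputs is to build the query string from a dict instead of editing the URL by hand. The sketch below uses only the standard library; `build_converter_url` is a hypothetical helper name, and the parameter names `Amount`, `From` and `To` are taken from the URL above.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.xe.com/currencyconverter/convert/"

def build_converter_url(amount, from_symbol, to_symbol):
    """Return the converter URL for the given amount and currency pair."""
    # urlencode escapes the values and joins them as key=value pairs with '&'.
    query = urlencode({"Amount": amount, "From": from_symbol, "To": to_symbol})
    return f"{BASE_URL}?{query}"

print(build_converter_url(1, "USD", "SGD"))
```

With `requests`, the same dict can instead be passed as `requests.get(BASE_URL, params={...})`, which builds the query string for you.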

--- 


**Try this if you completed the base workshop**
---
1. Store the same data in MongoDB as well.
   - Create an `exchange` db in your mLab account.
   - Create a collection `exchange-rates`.
   - You can store a sample record that looks like the following JSON.

```
{
    "from_symbol": "USD",
    "to_symbol": "SGD",
    "price" : 1.37,
    "datetime" : "...."
}
```
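A possible shape for the MongoDB step, as a sketch only. `build_record` and `store_record` are hypothetical helper names, storing the timestamp as an ISO string is an assumption (the sample record leaves the format open), and the connection string is a placeholder you would replace with your own. The pymongo wiring is shown in comments so the functions can be read and tried without a database.

```python
import datetime

def build_record(from_symbol, to_symbol, price, when=None):
    """Build a document shaped like the sample record above."""
    when = when or datetime.datetime.now()
    return {
        "from_symbol": from_symbol,
        "to_symbol": to_symbol,
        "price": float(price),
        "datetime": when.isoformat(),   # Assumption: timestamp stored as an ISO string.
    }

def store_record(collection, record):
    """Insert one document; `collection` is expected to be a pymongo Collection."""
    return collection.insert_one(record)

# Assumed wiring with pymongo (needs `pip install pymongo` and a reachable server):
#   from pymongo import MongoClient
#   client = MongoClient("<your connection string>")
#   collection = client["exchange"]["exchange-rates"]
#   store_record(collection, build_record("USD", "SGD", 1.37))
```

Keeping record construction separate from the insert makes it easy to reuse the same dict for the CSV row.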

## Solution to Part 1 -- ( initial, run part by part )

In [1]:
fsym = "USD"
tsyms = "JPY,INR"

url = f"https://min-api.cryptocompare.com/data/price?fsym={fsym}&tsyms={tsyms}"
print(f"url = {url}")      # print to check URL

import requests, json, datetime, pathlib, csv

resp = requests.get(url)
x = datetime.datetime.now()   # Get current time

print(resp)              # If response = 200 , then it is successful.
print(str(x))            # print DateTime in a string format

print(resp.text)         # print resp.text out
print(type(resp.text))   # Check type , result is a str

url = https://min-api.cryptocompare.com/data/price?fsym=USD&tsyms=JPY,INR
<Response [200]>
2019-09-23 15:43:36.990244
{"JPY":107.68,"INR":74.62}
<class 'str'>


In [2]:
data = resp.json()            # Store the resp in a python dict

print(type(data))             # Check type , result = dict
print(data)                   # Display data in dict form.

<class 'dict'>
{'JPY': 107.68, 'INR': 74.62}
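Before writing anything to disk, the dict can be turned into one row per target currency. This is a sketch under assumptions: `rates_to_rows` is a hypothetical helper, and rejecting non-numeric values is a defensive choice made here, not something the API documentation is quoted as requiring.

```python
import datetime

def rates_to_rows(from_symbol, rates, when):
    """Turn {"JPY": 107.68, ...} into [from_symbol, to_symbol, price, datetime] rows."""
    stamp = when.isoformat(sep=" ")
    rows = []
    for to_symbol, price in rates.items():
        # Guard against payloads that are not a plain symbol-to-number mapping.
        if isinstance(price, bool) or not isinstance(price, (int, float)):
            raise ValueError(f"unexpected value for {to_symbol!r}: {price!r}")
        rows.append([from_symbol, to_symbol, price, stamp])
    return rows

rows = rates_to_rows("USD", {"JPY": 107.68, "INR": 74.62},
                     datetime.datetime(2019, 9, 23, 15, 43, 36))
print(rows)
```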


## Solution to Part 1 (2A) -- ( initial )

##### (2A) Optional: Store the resp dict into a file (json)

In [3]:
with open('data.json', 'w') as fp:          # Save dict into a file with json
    json.dump(data, fp)

In [4]:
with open('data.json', 'r') as fp:          # Extract dict from a file into new variable
    data2 = json.load(fp)

data2      # Display data from new variable

{'JPY': 107.68, 'INR': 74.62}

In [5]:
# Appending a second JSON object leaves two concatenated documents in the file,
# which is not valid JSON, so json.load() can no longer read it back.
with open('data.json', 'a+') as fp:
    json.dump(data, fp)
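One common workaround for the append problem is the JSON Lines layout: one JSON object per line, appended and read back line by line. The sketch below is a swapped-in technique, not what the cell above does; the helper names and the `data.jsonl` file name are hypothetical.

```python
import json

def append_json_line(path, record):
    """Append one record as a single line of JSON."""
    with open(path, "a", encoding="utf-8") as fp:
        fp.write(json.dumps(record) + "\n")

def read_json_lines(path):
    """Read back every record, one JSON object per non-empty line."""
    with open(path, "r", encoding="utf-8") as fp:
        return [json.loads(line) for line in fp if line.strip()]

# Assumed usage with the `data` dict from the cells above:
#   append_json_line("data.jsonl", data)
#   history = read_json_lines("data.jsonl")
```

Each line stays independently parseable, so repeated runs can keep appending without corrupting earlier records.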

In [6]:
import pandas as pd

# Convert json dataset from dict to DataFrame
dfs = pd.DataFrame.from_dict(data, orient='index')
dfs[2] = x     # Add datetime in to a new column inside DataFrame
dfs

|     | 0 | 2 |
| --- | --- | --- |
| JPY | 107.68 | 2019-09-23 15:43:36.990244 |
| INR | 74.62 | 2019-09-23 15:43:36.990244 |


## CSV write function

In [7]:
fname = "WS03_Part1_rates.csv"

def save_to_csv(fname, datainput):
    if not pathlib.Path(fname).exists():               # To check & create new Header/Title.
        with open(fname, mode='w', newline='') as f:
            csv_writer = csv.writer(f)
            row_header = ["from_symbol", "to_symbol", "price", "datetime"]
            csv_writer.writerow(row_header)            # To write data (Title/Header)
            
    with open(fname, mode='a', newline='') as f:       # To write data -- append
        csv.writer(f).writerow(datainput)              # To write data -- (simplify vs. 2 line @ title)
        
for tsym, rate in data.items():                        # data ... from json
    row = [fsym, tsym, rate, x]
    save_to_csv(fname, row)                            # call def function to save into csv
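The exercise table includes an `sno` column, while `save_to_csv` above writes only four columns. A possible variant that stores the serial number in the file itself is sketched below with `csv.DictWriter`; `append_rows` and `FIELDS` are hypothetical names, and counting existing rows to continue the numbering is an assumption that suits small files.

```python
import csv
import pathlib

FIELDS = ["sno", "from_symbol", "to_symbol", "price", "datetime"]

def append_rows(fname, rows):
    """Append dict rows to a CSV, writing the header once and numbering rows from 1."""
    path = pathlib.Path(fname)
    is_new = not path.exists() or path.stat().st_size == 0
    if is_new:
        next_sno = 1
    else:
        # Count the existing data rows so the numbering continues across runs.
        with path.open(newline="") as f:
            next_sno = sum(1 for _ in csv.DictReader(f)) + 1
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        for offset, row in enumerate(rows):
            writer.writerow({"sno": next_sno + offset, **row})
```

Each row is a dict with the four data keys, for example `{"from_symbol": "USD", "to_symbol": "JPY", "price": 107.68, "datetime": str(x)}`.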
    

## Solution to Part 1 -- ( FINAL )

In [8]:
fsym = "USD"
tsyms = "JPY,INR,SGD"

url = f"https://min-api.cryptocompare.com/data/price?fsym={fsym}&tsyms={tsyms}"
print(f"url = {url}")      # print to check URL

import requests, json, datetime, pathlib, csv

resp = requests.get(url)
x = datetime.datetime.now()   # Get current time
data = resp.json()            # Store the resp in a python dict

fname = "WS03_Part1_rates.csv"

def save_to_csv(fname, datainput):
    if not pathlib.Path(fname).exists():               # To check & create new Header/Title.
        with open(fname, mode='w', newline='') as f:
            csv_writer = csv.writer(f)
            row_header = ["from_symbol", "to_symbol", "price", "datetime"]
            csv_writer.writerow(row_header)            # To write data (Title/Header)
            
    with open(fname, mode='a', newline='') as f:       # To write data -- append
        csv.writer(f).writerow(datainput)              # To write data -- (simplify vs. 2 lines @ Title)
        

for tsym, rate in data.items():                        # data ... from json
    row = [fsym, tsym, rate, x]
    save_to_csv(fname, row)                            # call def function to save into csv
    

url = https://min-api.cryptocompare.com/data/price?fsym=USD&tsyms=JPY,INR,SGD


In [9]:
import pandas as pd

df = pd.read_csv(fname)         # Read data from csv file.
df.index.name = "sno"           # Place a label for index.
df.index += 1                   # Define index start number. = 1

display(df)    # Display DataFrame with Pandas package.

| sno | from_symbol | to_symbol | price | datetime |
| --- | --- | --- | --- | --- |
| 1 | USD | JPY | 107.68 | 2019-09-23 15:43:36.990244 |
| 2 | USD | INR | 74.62 | 2019-09-23 15:43:36.990244 |
| 3 | USD | JPY | 107.69 | 2019-09-23 15:43:39.266619 |
| 4 | USD | INR | 74.62 | 2019-09-23 15:43:39.266619 |
| 5 | USD | SGD | 1.376 | 2019-09-23 15:43:39.266619 |


In [10]:
!pip install beautifulsoup4     # The real package name; it is imported as `bs4`.



## Solution to Part 2 --- ( initial )

In [11]:
url = "https://www.xe.com/currencyconverter/convert/?Amount=1&From=USD&To=SGD"

import requests, json, datetime
from bs4 import BeautifulSoup

def get_html_data(url: str ) -> str:
    headers = {
        'user-agent': 'Chrome/76.0.3809.100 Safari/537.36'
    } 
    resp = requests.get( url, headers=headers )
    print(resp)           # A status_code of 200 means that the page downloaded successfully.
    return resp.text


data = get_html_data( url )
soup = BeautifulSoup(data, "html.parser")
# print(type(soup))

x = datetime.datetime.now()   # Get current time
# print(x)                    # Check if DateTime is ok


tags = soup.find('div', attrs={'id': 'reactContainer'})      # The container exists but comes back empty.
# tags = soup.find_all('span', attrs={'class': 'converterresult-toAmount'})


section = soup.find_all('div', attrs={'id': 'reactContainer'})

# a = soup.body
# a                 # The rate is missing because the page fills it in with JavaScript.


<Response [200]>


In [12]:
data.find('reactContainer')


4171

In [13]:
from lxml import html
tree = html.fromstring(data)          # Parse the XE page HTML; the global `resp` is still the Part 1 API response.

result = tree.xpath('//div[@class="converterresult-conversionTo"]/text()')
rate = tree.xpath('//span[@class="converterresult-toAmount"]/text()')

print(result , rate)


[] []


In [14]:
# soup.__dict__

In [15]:
[type(item) for item in list(soup.children)]

[bs4.element.Doctype,
 bs4.element.NavigableString,
 bs4.element.Tag,
 bs4.element.NavigableString]

In [16]:
# resp.content          # print out the HTML content of the page using the content property:

In [17]:
soup.find_all('span', class_='converterresult-toAmount')       # The key element: it cannot be found because JavaScript renders it.

soup.find_all('div', class_='converterresult-conversionTo')    # Missing for the same reason.
soup.find_all('div', id='reactContainer')                      # Only this last expression is displayed below.


[<div id="reactContainer"></div>]

In [18]:
links = soup.find_all("div")
print(links)

tags = soup.find("div", id="reactContainer")

a = soup.body      # The body holds no rate because the page renders it with JavaScript.
# a

[<div id="reactContainer"></div>]
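The outputs above show the container arriving empty: the rate is filled in by JavaScript in the browser, so it never appears in the HTML that `requests` downloads. A quick way to confirm this before spending time on parser queries is to search the raw text for the markers you expect. The helper name `missing_markers` is hypothetical, and `static_html` is a stand-in modelled on the output above rather than the real page.

```python
def missing_markers(html_text, markers):
    """Return the markers (class names, ids) that do not occur in the raw HTML."""
    return [marker for marker in markers if marker not in html_text]

# Stand-in for the downloaded page: the container is present but empty.
static_html = '<html><body><div id="reactContainer"></div></body></html>'

print(missing_markers(static_html, ["reactContainer", "converterresult-toAmount"]))
```

If the class you saw in the browser's inspector is reported missing, the content is most likely rendered client-side and needs a real browser (or a different data source) to obtain.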


## Solution to Part 2 --- ( FINAL )

In [20]:
from selenium import webdriver                         # Drives a real browser so the page's JavaScript runs.
from time import sleep
from bs4 import BeautifulSoup
import csv, datetime, pathlib                          # csv and pathlib are needed by save_to_csv below.


def get_html_data_with_selenium(url):
    driver = webdriver.Chrome()
    try:
        driver.get( url )
        sleep(3)                       # Give the page's JavaScript time to render the rate.
        return driver.page_source
    finally:
        driver.quit()                  # Always shut the browser down, even if loading fails.


fname = "WS03_Part2_rates.csv"

def save_to_csv(fname, datainput):
    if not pathlib.Path(fname).exists():               # To check & create new Header/Title.
        with open(fname, mode='w', newline='') as f:
            csv_writer = csv.writer(f)
            row_header = [ "from_symbol" , "to_symbol" , "price" , "datetime" ]
            csv_writer.writerow(row_header)            # To write data (Title/Header)
            
    with open(fname, mode='a', newline='') as f:       # To write data -- append
        csv.writer(f).writerow(datainput)              # To write data -- (simplify vs. 2 lines @ Title)


_amt = "1"
_from = "USD"
_to = "SGD"
url = f"https://www.xe.com/currencyconverter/convert/?Amount={_amt}&From={_from}&To={_to}"
print(url)

data = get_html_data_with_selenium(url)
soup = BeautifulSoup(data, "html.parser")
print(type(soup))          # Checkpoint: confirm the HTML was parsed into a BeautifulSoup object.

result = None              # Stays None if the rate element is not found.
span_tag = soup.find_all("span")

for count_var in span_tag:     # Loop through the spans to find the one with the rate's class.
    class_tag = count_var.get("class")
    
    if class_tag and "converterresult-toAmount" in class_tag:
        result = float( count_var.getText() )
        break
        
print(result)              # Checkpoint: confirm the exchange rate was obtained.

row = [ _from , _to , result , datetime.datetime.now() ]
save_to_csv(fname, row)    # Save data into CSV


https://www.xe.com/currencyconverter/convert/?Amount=1&From=USD&To=SGD
<class 'bs4.BeautifulSoup'>
1.37752
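The fixed `sleep(3)` in `get_html_data_with_selenium` either wastes time or is too short on a slow connection. A possible alternative is to poll until the rate's class name shows up in the page source. The sketch below is a generic polling helper written with the standard library only; `wait_until` is a hypothetical name, and the commented usage assumes the `driver` and `url` from the cell above. Selenium also ships its own `WebDriverWait` class for this purpose.

```python
import time

def wait_until(predicate, timeout=10.0, interval=0.25):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        value = predicate()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)

# Assumed usage with the Selenium driver from the cell above:
#   driver.get(url)
#   wait_until(lambda: "converterresult-toAmount" in driver.page_source)
#   htmldata = driver.page_source
```

Polling returns as soon as the element is present and fails loudly with `TimeoutError` when it never appears, instead of silently parsing an incomplete page.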


In [21]:
import pandas as pd

df = pd.read_csv(fname)         # Read data from csv file.
df.index.name = "sno"           # Place a label for index.
df.index += 1                   # Define index start number. = 1

display(df)    # Display DataFrame with Pandas package.

| sno | from_symbol | to_symbol | price | datetime |
| --- | --- | --- | --- | --- |
| 1 | USD | SGD | 1.37752 | 2019-09-23 15:45:15.654267 |
