# Web Client



## Libraries
- [webbrowser](https://docs.python.org/3/library/webbrowser.html)
- [urllib.parse](https://docs.python.org/3/library/urllib.parse.html#module-urllib.parse)
- **[requests](http://docs.python-requests.org/en/master/)**
- **[BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/)**: for "quick turnaround screen scraping projects"
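
As a small illustration of `urllib.parse` from the list above, the sketch below assembles a query URL from a dict of parameters instead of writing the query string by hand, then parses it back. The parameter values mirror the Wikipedia API query used later in this notebook; the names `BASE_URL` and `params` are chosen here for illustration only.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base endpoint and query parameters (same values as the Wikipedia
# "random articles" query used later in this notebook).
BASE_URL = "https://en.wikipedia.org/w/api.php"
params = {
    "action": "query",
    "format": "json",
    "list": "random",
    "rnnamespace": 0,
    "rnlimit": 5,
}

# urlencode() joins the pairs with "&" and percent-encodes the values.
url = f"{BASE_URL}?{urlencode(params)}"
print(url)

# Going the other way: split the URL and parse its query string.
# parse_qs() returns each value as a list of strings.
query = parse_qs(urlsplit(url).query)
print(query["rnlimit"])
```

`requests.get` also accepts a `params=` dict and does this encoding itself, so building the URL manually is mostly useful when you need the URL as a string.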

# Downloading an image file

In [None]:
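import requests

# Download the GossetX SVG logo.
res = requests.get('https://gossetx.com/imgs/logo.svg')
res.raise_for_status()  # stop here if the server returned an error status
# Write the raw response bytes to disk ('wb' = write, binary mode).
with open('/tmp/logo.svg', 'wb') as f:
    f.write(res.content)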

# Call a website's published API

In [None]:
import requests
URL = "https://en.wikipedia.org/w/api.php?action=query&format=json&list=random&rnnamespace=0&rnlimit=5"
res = requests.get(URL)

In [None]:
# The response from the Wikipedia API is in JSON format,
# and the requests response object gives us a convenient
# ".json()" method that parses it into a dict.
jsonData = res.json()

In [None]:
titles = [article['title'] for article in jsonData['query']['random']]

In [None]:
# Wikipedia article URLs use underscores in place of spaces.
[f"https://en.wikipedia.org/wiki/{t.replace(' ', '_')}"
 for t in titles]

In [None]:
print(f"STATUS CODE: {res.status_code}")
data = res.json()
[f"https://en.wikipedia.org/wiki/{article['title']}" for article in data['query']['random']]
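
The cells above can be tried without network access. The sketch below parses a made-up JSON body shaped like the response used above (`data['query']['random'][i]['title']`) with the standard `json` module, which is roughly what `res.json()` does, and builds article URLs with `urllib.parse.quote` so that characters such as `?` do not break the URL. The sample titles and the helper name `article_url` are assumptions for illustration, not real API output.

```python
import json
from urllib.parse import quote

# Made-up sample body, shaped like the API response used above.
SAMPLE_BODY = """
{"query": {"random": [
    {"id": 1, "ns": 0, "title": "Monty Python"},
    {"id": 2, "ns": 0, "title": "Quo vadis?"},
    {"id": 3, "ns": 0, "title": "AC/DC"}
]}}
"""


def article_url(title):
    """Return an English Wikipedia URL for an article title (hypothetical helper)."""
    # Spaces become underscores; quote() percent-encodes the remaining
    # unsafe characters (by default it leaves "/" untouched).
    return "https://en.wikipedia.org/wiki/" + quote(title.replace(" ", "_"))


data = json.loads(SAMPLE_BODY)
urls = [article_url(article["title"]) for article in data["query"]["random"]]
for u in urls:
    print(u)
```

Without quoting, the `?` in the second title would be read as the start of a query string, so the link would point to a different page.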

# Scraping data from a website

Scraping means extracting data from a website's raw content (usually HTML) when:

1. The website doesn't provide a public API, or
2. You (for whatever, hopefully legal, reason) don't want to use the public API.

## Prerequisite: Learn CSS selectors

A great tutorial is: https://flukeout.github.io/
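
To show how CSS selectors map onto BeautifulSoup, here is a minimal sketch that runs against a small inline HTML string, so it needs no network access. The HTML, its ids and its class names are made up for illustration; only `select`, `select_one` and `get_text` are BeautifulSoup calls.

```python
from bs4 import BeautifulSoup

# A small made-up HTML document (not from any real site).
HTML = """
<html><body>
  <div id="gallery">
    <img class="thumb" src="/imgs/a.png" alt="A">
    <img class="thumb" src="/imgs/b.png" alt="B">
  </div>
  <p class="caption">Two <a href="/about">sample</a> images</p>
</body></html>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Type selector: every <img> element in the document.
img_srcs = [img["src"] for img in soup.select("img")]

# ID + descendant + class: <img class="thumb"> anywhere inside #gallery.
thumbs = soup.select("#gallery img.thumb")

# Child combinator: an <a> that is a direct child of <p class="caption">.
# select_one() returns the first match, or None if nothing matches.
link = soup.select_one("p.caption > a")

print(img_srcs)
print(len(thumbs))
print(link["href"], link.get_text())
```

For a real page, the `HTML` string would be replaced by `res.text` from a `requests.get` call.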

In [None]:
import requests

# Exercise: Web Client

Two choices (choose the one you prefer):

1. Use the ICNDB public API to get 3 Chuck Norris jokes. Extract ONLY the joke text from the JSON response.

API documentation: http://www.icndb.com/api/

Getting started:

```python
import requests
res = requests.get(THE_API_URL)
data = res.json()
# NEXT, EXTRACT WHAT YOU WANT FROM THE DATA
```

2. Pick a website with data of interest and scrape that data using requests and BeautifulSoup. For example, get all the `<img>` elements from a given webpage.

Getting started:

```python
import requests
from bs4 import BeautifulSoup
res = requests.get(THE_WEBSITE_URL)
soup = BeautifulSoup(res.text, 'html.parser')  # name the parser explicitly to avoid a bs4 warning
imgs = soup.select('img')
# NEXT, EXTRACT WHAT YOU WANT FROM THE DATA
```

In [None]:
import requests
res = requests.get('http://api.icndb.com/jokes/random/3')

In [None]:
data = res.json()

In [None]:
[j['joke'] for j in data['value']]

# API and/or WebApp server

Popular frameworks include:

- [Django](https://www.djangoproject.com/) (full-featured "enterprise" web framework)
- [Flask](http://flask.pocoo.org/) (popular microframework)