## A practical case

> - Retrieve information from a URL
> - and convert it into a DataFrame
> - to operate on the data

### Retrieve the Information from a `url`

https://github.com/jsulopz/data

> - Find the `function()` that gets the content from a `url`

In [36]:
import requests
import pandas as pd  # needed for the pd.DataFrame calls below

res = requests.get('https://raw.githubusercontent.com/jsulopz/data/main/best_tennis_players_stats.json')
res

<Response [200]>

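`<Response [200]>` shows the HTTP status code, and 200 means the server answered successfully. Here is a minimal sketch of how the status could be checked before using the data. The `Response` objects are built by hand so the example runs offline, and `describe` is a hypothetical helper; in the notebook, `res` comes from `requests.get()`.

```python
import requests

def describe(res):
    """Return a short description of an HTTP response status."""
    try:
        res.raise_for_status()  # raises HTTPError for 4xx and 5xx codes
    except requests.HTTPError:
        return f"failed with status {res.status_code}"
    return f"ok with status {res.status_code}"

# Hand-built stand-ins for the objects returned by requests.get()
good = requests.Response()
good.status_code = 200

bad = requests.Response()
bad.status_code = 404

print(describe(good))
print(describe(bad))
```
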
> - Is the object just `<Response [200]>`?
> - Or does it contain more information?

In [42]:
res.  # incomplete on purpose: press Tab after the dot to list the object's attributes and methods

SyntaxError: invalid syntax (1459325066.py, line 1)

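Where Tab completion isn't available, the built-in `dir()` gives a comparable list of names. This is a sketch that assumes a hand-built, empty `Response` as a stand-in and a hypothetical helper called `public_names`:

```python
import requests

def public_names(obj):
    """List attribute and method names that don't start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]

res_demo = requests.Response()  # empty stand-in, no request is sent
names = public_names(res_demo)

# Candidates for getting at the data carried by the response
print([name for name in names if name in ("content", "text", "json")])
```
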
> - How can you access the data we see [here](https://raw.githubusercontent.com/jsulopz/data/main/best_tennis_players_stats.json)?

In [43]:
res.content

b'{"income":{"roger":130,"rafa":127,"nole":154},"titles":{"roger":103,"rafa":90,"nole":86},"grand slams":{"roger":20,"rafa":21,"nole":20},"turned professional":{"roger":1998,"rafa":2001,"nole":2003},"wins":{"roger":1251,"rafa":1038,"nole":989},"losses":{"roger":275,"rafa":209,"nole":199}}'

In [44]:
pd.DataFrame(res.content)

ValueError: DataFrame constructor not properly called!

In [45]:
pd.DataFrame('{"income":{"roger":130,"rafa":127,"nole":154},"titles":{"roger":103,"rafa":90,"nole":86},"grand slams":{"roger":20,"rafa":21,"nole":20},"turned professional":{"roger":1998,"rafa":2001,"nole":2003},"wins":{"roger":1251,"rafa":1038,"nole":989},"losses":{"roger":275,"rafa":209,"nole":199}}')

ValueError: DataFrame constructor not properly called!

In [46]:
pd.DataFrame('{"nombres": ["juan", "pepe"], "peso": [67,45]}')

ValueError: DataFrame constructor not properly called!

In [47]:
pd.DataFrame({"nombres": ["juan", "pepe"], "peso": [67,45]})

| | nombres | peso |
|---|---|---|
| 0 | juan | 67 |
| 1 | pepe | 45 |


> - Is there a way to get the data from the `url`
> - as a dictionary, like this ↓

In [48]:
{"nombres": ["juan", "pepe"], "peso": [67,45]}

{'nombres': ['juan', 'pepe'], 'peso': [67, 45]}

> - and not as bytes, like this ↓

In [49]:
b'{"nombres": ["juan", "pepe"], "peso": [67,45]}'

b'{"nombres": ["juan", "pepe"], "peso": [67,45]}'

> - Apply the same discipline: look for a `function()` within the object

In [50]:
res.json()

{'income': {'roger': 130, 'rafa': 127, 'nole': 154},
 'titles': {'roger': 103, 'rafa': 90, 'nole': 86},
 'grand slams': {'roger': 20, 'rafa': 21, 'nole': 20},
 'turned professional': {'roger': 1998, 'rafa': 2001, 'nole': 2003},
 'wins': {'roger': 1251, 'rafa': 1038, 'nole': 989},
 'losses': {'roger': 275, 'rafa': 209, 'nole': 199}}

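`res.json()` parses the response body as JSON and returns Python objects. The standard-library `json.loads()` does comparable work on the raw bytes, which makes the difference between `res.content` and `res.json()` visible. This sketch uses a shortened copy of the payload (two of the six keys, for brevity):

```python
import json

raw = b'{"income":{"roger":130,"rafa":127,"nole":154},"titles":{"roger":103,"rafa":90,"nole":86}}'

data = json.loads(raw)  # bytes in, nested dictionaries out

print(type(raw).__name__)
print(type(data).__name__)
print(data["income"]["nole"])
```
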
In [51]:
pd.DataFrame(res.json())

| | income | titles | grand slams | turned professional | wins | losses |
|---|---|---|---|---|---|---|
| roger | 130 | 103 | 20 | 1998 | 1251 | 275 |
| rafa | 127 | 90 | 21 | 2001 | 1038 | 209 |
| nole | 154 | 86 | 20 | 2003 | 989 | 199 |

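When `pd.DataFrame()` receives a dictionary of dictionaries, the outer keys become the columns and the inner keys become the row index. This sketch uses part of the data (two columns only, for brevity); `DataFrame.from_dict(..., orient="index")` gives the transposed layout:

```python
import pandas as pd

data = {
    "income": {"roger": 130, "rafa": 127, "nole": 154},
    "titles": {"roger": 103, "rafa": 90, "nole": 86},
}

df_players = pd.DataFrame(data)                          # players as rows
df_stats = pd.DataFrame.from_dict(data, orient="index")  # players as columns

print(df_players.shape)
print(df_stats.shape)
```
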

### Recap

In [52]:
res = requests.get(url='https://raw.githubusercontent.com/jsulopz/data/main/best_tennis_players_stats.json')

In [53]:
res.content

b'{"income":{"roger":130,"rafa":127,"nole":154},"titles":{"roger":103,"rafa":90,"nole":86},"grand slams":{"roger":20,"rafa":21,"nole":20},"turned professional":{"roger":1998,"rafa":2001,"nole":2003},"wins":{"roger":1251,"rafa":1038,"nole":989},"losses":{"roger":275,"rafa":209,"nole":199}}'

In [54]:
pd.DataFrame(res.content)

ValueError: DataFrame constructor not properly called!

In [55]:
res.json()

{'income': {'roger': 130, 'rafa': 127, 'nole': 154},
 'titles': {'roger': 103, 'rafa': 90, 'nole': 86},
 'grand slams': {'roger': 20, 'rafa': 21, 'nole': 20},
 'turned professional': {'roger': 1998, 'rafa': 2001, 'nole': 2003},
 'wins': {'roger': 1251, 'rafa': 1038, 'nole': 989},
 'losses': {'roger': 275, 'rafa': 209, 'nole': 199}}

In [56]:
pd.DataFrame(res.json())

| | income | titles | grand slams | turned professional | wins | losses |
|---|---|---|---|---|---|---|
| roger | 130 | 103 | 20 | 1998 | 1251 | 275 |
| rafa | 127 | 90 | 21 | 2001 | 1038 | 209 |
| nole | 154 | 86 | 20 | 2003 | 989 | 199 |


### Shouldn't it be easier?

> - Apply the same discipline to find a `function()` within a library

In [57]:
pd.read_json('https://raw.githubusercontent.com/jsulopz/data/main/best_tennis_players_stats.json')

| | income | titles | grand slams | turned professional | wins | losses |
|---|---|---|---|---|---|---|
| roger | 130 | 103 | 20 | 1998 | 1251 | 275 |
| rafa | 127 | 90 | 21 | 2001 | 1038 | 209 |
| nole | 154 | 86 | 20 | 2003 | 989 | 199 |

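`pd.read_json()` accepts a URL, a file path, or a file-like object, so it covers the download and the parsing in a single call. This sketch wraps a shortened copy of the JSON text in `io.StringIO` so that it runs without network access:

```python
import io
import pandas as pd

text = '{"income":{"roger":130,"rafa":127,"nole":154},"wins":{"roger":1251,"rafa":1038,"nole":989}}'

# A file-like object stands in for the URL used in the notebook
df_offline = pd.read_json(io.StringIO(text))

print(df_offline)
```
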

In [58]:
df = pd.read_json('https://raw.githubusercontent.com/jsulopz/data/main/best_tennis_players_stats.json')

In [59]:
df

| | income | titles | grand slams | turned professional | wins | losses |
|---|---|---|---|---|---|---|
| roger | 130 | 103 | 20 | 1998 | 1251 | 275 |
| rafa | 127 | 90 | 21 | 2001 | 1038 | 209 |
| nole | 154 | 86 | 20 | 2003 | 989 | 199 |


> - Now calculate the `sum()` of the `income` column

In [60]:
df.income.sum()

411
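
`sum()` is one of many column operations. Here is a sketch of a few others on a shortened copy of the data (three columns, rebuilt from a dictionary so it runs offline). Note that a column name containing a space, such as `grand slams`, needs bracket notation rather than dot notation:

```python
import pandas as pd

df_demo = pd.DataFrame({
    "income": {"roger": 130, "rafa": 127, "nole": 154},
    "titles": {"roger": 103, "rafa": 90, "nole": 86},
    "grand slams": {"roger": 20, "rafa": 21, "nole": 20},
})

total_income = df_demo["income"].sum()        # same result as df_demo.income.sum()
top_earner = df_demo["income"].idxmax()       # index label of the highest income
most_slams = df_demo["grand slams"].idxmax()  # dot notation can't reach this column

print(total_income, top_earner, most_slams)
```
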