# Pandas basics

Can we see `countries.csv` and `continent_facts.csv`? Let's check with `ls`.

> The `!` allows you to run command line commands in a notebook

In [6]:
!ls

01-pandas basics.ipynb continent_facts.csv
Do Now.ipynb           countries.csv


If you don't see `countries.csv` etc, find out where the notebook is with `pwd` and move those files to that location.

In [2]:
!pwd

/Users/soma/Library/CloudStorage/Dropbox/Soma/Curriculum/2024-lede/06-pandas-git-github/06-classwork-inclass


## Let's install pandas!

`--quiet` makes `pip` print less. The opposite of quiet is "verbose": that's the word you're looking for if you want EVEN MORE details.

You only need to run `pip install` once: it goes out on the internet, downloads pandas, and installs it somewhere on your computer.

In [11]:
!pip install --quiet pandas


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.0.1[0m[39;49m -> [0m[32;49m24.0[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


`import` says "I installed this on my computer somewhere, bring it into this file/notebook/whatever so I can use it here"

In [12]:
# This means "I want to use requests in this notebook/file"
import requests

If you get `ModuleNotFoundError` for `pandas` after using `!pip install`, it's because the Python on your command line is NOT the same Python that Jupyter is using. We *could* fix it permanently, but for now you can just use `%pip install pandas` instead, which installs into the notebook's own Python.
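If you're curious which Python the notebook is actually running, here is a small sketch (the variable name `python_path` is just for illustration). Compare what it prints with what `which python` says in your terminal; if they differ, that is the mismatch described above.

```python
import sys

# Path of the Python interpreter that is running this notebook/script.
# If the `pip` on your command line belongs to a different Python,
# the packages it installs will not be importable here.
python_path = sys.executable
print(python_path)
```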

In [18]:
import pandas as pd

In [107]:
# Use pd to read in the csv "countries.csv"
# this is called a dataframe
df = pd.read_csv("countries.csv")
# Let's look at the first five
df.head()

Unnamed: 0,country,continent,life_expectancy,population,gdp
0,Afghanistan,Asia,54.863,22856302,15153728226
1,Albania,Europe,74.2,3071856,12886435920
2,Algeria,Africa,68.963,30533827,155661450046
3,Angola,Africa,45.234,13926373,34063908358
4,Antigua and Barbuda,N. America,73.544,77656,989182128


In [108]:
# Show me the last five rows
# two approaches:
# guess what the opposite of .head() is
# use tab

In [109]:
df.tail()

Unnamed: 0,country,continent,life_expectancy,population,gdp
183,Vietnam,Asia,73.777,78758010,124201381770
184,West Bank and Gaza,Asia,70.929,3198560,24264276160
185,"Yemen, Rep.",Asia,60.404,17723186,39292303362
186,Zambia,Africa,41.802,10201562,10558616670
187,Zimbabwe,Africa,43.976,12509477,9319560365


In [110]:
# What is last() for?
# hold down shift + tab to get help documentation
# df.last

In [111]:
# df.head() gives the first five rows; df.head(3) would give just the first three
df.head()

Unnamed: 0,country,continent,life_expectancy,population,gdp
0,Afghanistan,Asia,54.863,22856302,15153728226
1,Albania,Europe,74.2,3071856,12886435920
2,Algeria,Africa,68.963,30533827,155661450046
3,Angola,Africa,45.234,13926373,34063908358
4,Antigua and Barbuda,N. America,73.544,77656,989182128


**Above** it's pretty, it's a dataframe

**Below** it's ugly, it's just one column, it's a Series

In [112]:
# Try to guess how to get the life expectancy of 
# every! single! country!
# I promise it's easy and it's what you hope it is
df['life_expectancy']

0      54.863
1      74.200
2      68.963
3      45.234
4      73.544
        ...  
183    73.777
184    70.929
185    60.404
186    41.802
187    43.976
Name: life_expectancy, Length: 188, dtype: float64

In [113]:
# hey pandas!
# give me a dataframe
# give me the life expectancy column
df['life_expectancy']

0      54.863
1      74.200
2      68.963
3      45.234
4      73.544
        ...  
183    73.777
184    70.929
185    60.404
186    41.802
187    43.976
Name: life_expectancy, Length: 188, dtype: float64

In [114]:
df.sort_values('life_expectancy')

Unnamed: 0,country,continent,life_expectancy,population,gdp
147,Sierra Leone,Africa,38.123,4143115,2141990455
186,Zambia,Africa,41.802,10201562,10558616670
31,Central African Rep.,Africa,43.727,3701607,2820624534
187,Zimbabwe,Africa,43.976,12509477,9319560365
3,Angola,Africa,45.234,13926373,34063908358
...,...,...,...,...,...
160,Sweden,Europe,79.840,8860153,253754781920
7,Australia,Oceania,79.930,19164351,560384787591
161,Switzerland,Europe,79.990,7167908,246475684488
72,"Hong Kong, China",Asia,80.361,6783317,203309577124


In [115]:
df.sort_values('life_expectancy', ascending=False)

Unnamed: 0,country,continent,life_expectancy,population,gdp
83,Japan,Asia,81.350,125720310,3590446333290
72,"Hong Kong, China",Asia,80.361,6783317,203309577124
161,Switzerland,Europe,79.990,7167908,246475684488
7,Australia,Oceania,79.930,19164351,560384787591
160,Sweden,Europe,79.840,8860153,253754781920
...,...,...,...,...,...
3,Angola,Africa,45.234,13926373,34063908358
187,Zimbabwe,Africa,43.976,12509477,9319560365
31,Central African Rep.,Africa,43.727,3701607,2820624534
186,Zambia,Africa,41.802,10201562,10558616670


In [116]:
df[['country', 'life_expectancy']]

Unnamed: 0,country,life_expectancy
0,Afghanistan,54.863
1,Albania,74.200
2,Algeria,68.963
3,Angola,45.234
4,Antigua and Barbuda,73.544
...,...,...
183,Vietnam,73.777
184,West Bank and Gaza,70.929
185,"Yemen, Rep.",60.404
186,Zambia,41.802
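To make the "pretty dataframe" versus "ugly Series" difference concrete, here is a minimal sketch. It uses a tiny hand-built dataframe called `small_df` (a made-up name, with three rows copied from the output above) instead of `countries.csv`, so it runs on its own.

```python
import pandas as pd

# A tiny stand-in for countries.csv (values copied from rows shown above)
small_df = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria"],
    "life_expectancy": [54.863, 74.2, 68.963],
})

one_column = small_df["life_expectancy"]       # single brackets -> Series
still_a_frame = small_df[["life_expectancy"]]  # double brackets -> DataFrame

print(type(one_column).__name__)
print(type(still_a_frame).__name__)
```

Single brackets with one column name hand back a Series; double brackets (a list of column names) hand back a dataframe, even when the list has only one name in it.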


In [117]:
df.sort_values('life_expectancy', ascending=False).head(10)

Unnamed: 0,country,continent,life_expectancy,population,gdp
83,Japan,Asia,81.35,125720310,3590446333290
72,"Hong Kong, China",Asia,80.361,6783317,203309577124
161,Switzerland,Europe,79.99,7167908,246475684488
7,Australia,Oceania,79.93,19164351,560384787591
160,Sweden,Europe,79.84,8860153,253754781920
81,Italy,Europe,79.73,56986329,1547748695640
74,Iceland,Europe,79.72,281210,8743381320
29,Canada,N. America,79.41,30667365,995094659520
155,Spain,Europe,79.34,40288457,943152778370
57,France,Europe,79.23,59047795,1690892657620


In [118]:
# never use inplace=True
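One common alternative to `inplace=True`, sketched here on a made-up three-row dataframe (`small_df` and `sorted_df` are illustrative names): save the result into a variable. Methods like `sort_values` return a new dataframe and leave the original alone.

```python
import pandas as pd

small_df = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria"],
    "life_expectancy": [54.863, 74.2, 68.963],
})

# sort_values returns a NEW dataframe; small_df itself is unchanged
sorted_df = small_df.sort_values("life_expectancy", ascending=False)

print(small_df["country"].tolist())   # original order
print(sorted_df["country"].tolist())  # sorted order
```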

In [119]:
# Can I do .head().tail()?
# df -> Give me the dataframe
# .head() -> give me the first five (of the dataframe)
# .tail() -> give me the last five (of the first five)
df.head().tail()

Unnamed: 0,country,continent,life_expectancy,population,gdp
0,Afghanistan,Asia,54.863,22856302,15153728226
1,Albania,Europe,74.2,3071856,12886435920
2,Algeria,Africa,68.963,30533827,155661450046
3,Angola,Africa,45.234,13926373,34063908358
4,Antigua and Barbuda,N. America,73.544,77656,989182128


In [120]:
# sort everything in the dataframe by life expectancy
# then give me the first ten
df.sort_values('life_expectancy', ascending=False).head(10)

# give me the first ten rows
# then sort them by life expectancy
df.head(10).sort_values('life_expectancy', ascending=False)

Unnamed: 0,country,continent,life_expectancy,population,gdp
7,Australia,Oceania,79.93,19164351,560384787591
8,Austria,Europe,78.33,8004712,256214821696
1,Albania,Europe,74.2,3071856,12886435920
5,Argentina,S. America,73.822,36930709,390394524839
4,Antigua and Barbuda,N. America,73.544,77656,989182128
6,Armenia,Europe,71.494,3076098,6502871172
2,Algeria,Africa,68.963,30533827,155661450046
9,Azerbaijan,Europe,66.851,8110723,20544461359
0,Afghanistan,Asia,54.863,22856302,15153728226
3,Angola,Africa,45.234,13926373,34063908358


In [121]:
# give me the row with the index of 2
df.loc[2]

country                 Algeria
continent                Africa
life_expectancy          68.963
population             30533827
gdp                155661450046
Name: 2, dtype: object
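Note that `.loc` looks rows up by their index *label*, not by where they sit in the dataframe. The sketch below, on a made-up three-row dataframe, shows how that differs from `.iloc` (which goes by position) once the rows have been sorted.

```python
import pandas as pd

small_df = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria"],
    "life_expectancy": [54.863, 74.2, 68.963],
})

sorted_df = small_df.sort_values("life_expectancy", ascending=False)

by_label = sorted_df.loc[2]      # the row whose index label is 2
by_position = sorted_df.iloc[0]  # the first row in the current order

print(by_label["country"])
print(by_position["country"])
```

After sorting, the labels travel with their rows, so `.loc[2]` is still Algeria while `.iloc[0]` is whichever row ended up on top.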

In [122]:
df['continent']

0            Asia
1          Europe
2          Africa
3          Africa
4      N. America
          ...    
183          Asia
184          Asia
185          Asia
186        Africa
187        Africa
Name: continent, Length: 188, dtype: object

In [123]:
df[df['life_expectancy'] > 80]

Unnamed: 0,country,continent,life_expectancy,population,gdp
72,"Hong Kong, China",Asia,80.361,6783317,203309577124
83,Japan,Asia,81.35,125720310,3590446333290


In [124]:
# Every country in Asia with a life expectancy over 75
# df['life_expectancy'] > 75
# df['continent'] == 'Asia'

# Python's `and` does not work between two Series:
# df['life_expectancy'] > 75 and df['continent'] == 'Asia'
# we have to use & and wrap each condition in parentheses, sadly :(
# (df['life_expectancy'] > 75) & (df['continent'] == 'Asia')
df[(df['life_expectancy'] > 75) & (df['continent'] == 'Asia')]

Unnamed: 0,country,continent,life_expectancy,population,gdp
23,Brunei,Asia,75.927,327036,15704268720
72,"Hong Kong, China",Asia,80.361,6783317,203309577124
80,Israel,Asia,78.75,6014953,137345436802
83,Japan,Asia,81.35,125720310,3590446333290
89,"Korea, Rep.",Asia,76.114,45987624,781559669880
100,"Macao, China",Asia,77.627,431867,9722189904
135,Qatar,Asia,76.68,590957,35690257058
148,Singapore,Asia,78.34,3919300,144363496200
163,Taiwan,Asia,76.02,22183000,521855075000
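Here is a minimal sketch of why `and` fails and `&` works, using a made-up three-row dataframe (`small_df`, `is_long_lived`, and `is_asia` are illustrative names). Each comparison produces a whole Series of True/False values, and `&` combines them row by row.

```python
import pandas as pd

small_df = pd.DataFrame({
    "country": ["Japan", "Afghanistan", "Switzerland"],
    "continent": ["Asia", "Asia", "Europe"],
    "life_expectancy": [81.35, 54.863, 79.99],
})

is_long_lived = small_df["life_expectancy"] > 75
is_asia = small_df["continent"] == "Asia"

# `and` wants ONE True/False for the whole Series, so pandas raises ValueError
try:
    is_long_lived and is_asia
    and_worked = True
except ValueError:
    and_worked = False

# `&` compares the two Series one row at a time
both = is_long_lived & is_asia
result = small_df[both]

print(and_worked)
print(result)
```

The parentheses in the one-line version matter because `&` binds more tightly than `>` and `==`; saving each condition to its own variable, as here, is another way around that.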


In [125]:
# This seems easier to read to me???? but no one does it :(
df.query('life_expectancy > 75 and continent == "Asia"')

Unnamed: 0,country,continent,life_expectancy,population,gdp
23,Brunei,Asia,75.927,327036,15704268720
72,"Hong Kong, China",Asia,80.361,6783317,203309577124
80,Israel,Asia,78.75,6014953,137345436802
83,Japan,Asia,81.35,125720310,3590446333290
89,"Korea, Rep.",Asia,76.114,45987624,781559669880
100,"Macao, China",Asia,77.627,431867,9722189904
135,Qatar,Asia,76.68,590957,35690257058
148,Singapore,Asia,78.34,3919300,144363496200
163,Taiwan,Asia,76.02,22183000,521855075000


In [126]:
df[df['country'] == 'Mexico']


Unnamed: 0,country,continent,life_expectancy,population,gdp
110,Mexico,N. America,74.38,99959594,1088959817036


In [127]:
# Filter for everything that is in Asia
# df['continent'] == 'Asia'
df[df['continent'] == 'Asia']

Unnamed: 0,country,continent,life_expectancy,population,gdp
0,Afghanistan,Asia,54.863,22856302,15153728226
11,Bahrain,Asia,74.497,638193,14049818895
12,Bangladesh,Asia,65.309,129592275,139311695625
18,Bhutan,Asia,60.307,571262,1669227564
23,Brunei,Asia,75.927,327036,15704268720
27,Cambodia,Asia,62.03,12446949,12222903918
34,China,Asia,72.124,1269116737,4323880722959
72,"Hong Kong, China",Asia,80.361,6783317,203309577124
75,India,Asia,62.129,1053898107,1736824080336
76,Indonesia,Asia,67.289,213395411,579155145454


In [128]:
df.sort_values('life_expectancy', ascending=False) \
    .head(10) \
    [['country', 'life_expectancy']]


Unnamed: 0,country,life_expectancy
83,Japan,81.35
72,"Hong Kong, China",80.361
161,Switzerland,79.99
7,Australia,79.93
160,Sweden,79.84
81,Italy,79.73
74,Iceland,79.72
29,Canada,79.41
155,Spain,79.34
57,France,79.23


In [129]:
(
    df.sort_values('life_expectancy', ascending=False)
        .head(10)[['country', 'life_expectancy']]
)

Unnamed: 0,country,life_expectancy
83,Japan,81.35
72,"Hong Kong, China",80.361
161,Switzerland,79.99
7,Australia,79.93
160,Sweden,79.84
81,Italy,79.73
74,Iceland,79.72
29,Canada,79.41
155,Spain,79.34
57,France,79.23


In [130]:
# .average?
# .avg?
# .....?
df['life_expectancy'].mean()

66.50153603526596

In [131]:
# Complain complain complain, mean is easily skewed by outliers
# I want a better measure of central tendency
df['life_expectancy'].median()

70.04150000000001
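A quick sketch of what "skewed by outliers" means, using made-up numbers rather than the countries data: one extreme value drags the mean a long way but leaves the median where it was.

```python
import pandas as pd

# Made-up numbers: four typical values and one extreme outlier
values = pd.Series([1, 2, 3, 4, 100])

mean_value = values.mean()      # pulled upward by the outlier
median_value = values.median()  # the middle value, unaffected by it

print(mean_value, median_value)
```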

In [132]:
df['life_expectancy'].describe()

count    188.000000
mean      66.501536
std       10.298458
min       38.123000
25%       59.663750
50%       70.041500
75%       74.134500
max       81.350000
Name: life_expectancy, dtype: float64

In [133]:
df['gdp'].sum()

48316204002444

In [134]:
df.head(2)

Unnamed: 0,country,continent,life_expectancy,population,gdp
0,Afghanistan,Asia,54.863,22856302,15153728226
1,Albania,Europe,74.2,3071856,12886435920


In [135]:
df[['life_expectancy', 'gdp']].corr()

Unnamed: 0,life_expectancy,gdp
life_expectancy,1.0,0.209858
gdp,0.209858,1.0


In [136]:
# calculate the per capita gdp, gdp per capita
# gdp / population
df['gdp'] / df['population']

0        663.0
1       4195.0
2       5098.0
3       2446.0
4      12738.0
        ...   
183     1577.0
184     7586.0
185     2217.0
186     1035.0
187      745.0
Length: 188, dtype: float64

In [137]:
# Try to save this as a column??????
df['gdp_per_capita'] = df['gdp'] / df['population']
df.head(2)

Unnamed: 0,country,continent,life_expectancy,population,gdp,gdp_per_capita
0,Afghanistan,Asia,54.863,22856302,15153728226,663.0
1,Albania,Europe,74.2,3071856,12886435920,4195.0


In [138]:
# Step 1: df
# Step 2: filters
# Step 3: groupby
# Step 4: column(s)
# Step 5: the math

df.groupby('continent')['life_expectancy'].median()

continent
Africa        52.1025
Asia          70.4275
Europe        75.5800
N. America    70.5765
Oceania       67.4955
S. America    70.7660
Name: life_expectancy, dtype: float64
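The cell above skips Step 2 (filters). As a sketch of all five steps together, here is the same pattern on a made-up five-row dataframe; the cutoff of 60 is arbitrary and only there for illustration.

```python
import pandas as pd

small_df = pd.DataFrame({
    "country": ["Japan", "Afghanistan", "Switzerland", "Albania", "Sweden"],
    "continent": ["Asia", "Asia", "Europe", "Europe", "Europe"],
    "life_expectancy": [81.35, 54.863, 79.99, 74.2, 79.84],
})

# Step 1: df -> Step 2: filter -> Step 3: groupby -> Step 4: column -> Step 5: math
medians = (
    small_df[small_df["life_expectancy"] > 60]
    .groupby("continent")["life_expectancy"]
    .median()
)
print(medians)
```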

In [139]:
5 * 5

25

In [140]:
df.head()

Unnamed: 0,country,continent,life_expectancy,population,gdp,gdp_per_capita
0,Afghanistan,Asia,54.863,22856302,15153728226,663.0
1,Albania,Europe,74.2,3071856,12886435920,4195.0
2,Algeria,Africa,68.963,30533827,155661450046,5098.0
3,Angola,Africa,45.234,13926373,34063908358,2446.0
4,Antigua and Barbuda,N. America,73.544,77656,989182128,12738.0


In [141]:
south_america_df = df[df['continent'] == 'S. America']
south_america_df.head(2)

Unnamed: 0,country,continent,life_expectancy,population,gdp,gdp_per_capita
5,Argentina,S. America,73.822,36930709,390394524839,10571.0
19,Bolivia,S. America,62.994,8307248,28194799712,3394.0


In [142]:
south_america_df['population'] = 0

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  south_america_df['population'] = 0


In [143]:
south_america_df

Unnamed: 0,country,continent,life_expectancy,population,gdp,gdp_per_capita
5,Argentina,S. America,73.822,0,390394524839,10571.0
19,Bolivia,S. America,62.994,0,28194799712,3394.0
22,Brazil,S. America,70.261,0,1405170917672,8056.0
33,Chile,S. America,77.01,0,160905821700,10435.0
35,Colombia,S. America,71.026,0,228604190334,5749.0
48,Ecuador,S. America,73.353,0,69329649168,5616.0
69,Guyana,S. America,63.29,0,2302670241,3141.0
130,Paraguay,S. America,70.073,0,20877206873,3907.0
131,Peru,S. America,70.506,0,149766187617,5791.0
158,Suriname,S. America,67.851,0,2650284742,5677.0


In [144]:
# df[df['continent'] == 'S. America']

In [145]:
df[df['continent'] == 'S. America']

Unnamed: 0,country,continent,life_expectancy,population,gdp,gdp_per_capita
5,Argentina,S. America,73.822,36930709,390394524839,10571.0
19,Bolivia,S. America,62.994,8307248,28194799712,3394.0
22,Brazil,S. America,70.261,174425387,1405170917672,8056.0
33,Chile,S. America,77.01,15419820,160905821700,10435.0
35,Colombia,S. America,71.026,39764166,228604190334,5749.0
48,Ecuador,S. America,73.353,12345023,69329649168,5616.0
69,Guyana,S. America,63.29,733101,2302670241,3141.0
130,Paraguay,S. America,70.073,5343539,20877206873,3907.0
131,Peru,S. America,70.506,25861887,149766187617,5791.0
158,Suriname,S. America,67.851,466846,2650284742,5677.0
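One common way to avoid that warning, sketched on a made-up three-row dataframe: add `.copy()` when you save the filtered rows, so pandas knows you want a separate dataframe that you are free to edit.

```python
import pandas as pd

small_df = pd.DataFrame({
    "country": ["Argentina", "Bolivia", "Albania"],
    "continent": ["S. America", "S. America", "Europe"],
    "population": [36930709, 8307248, 3071856],
})

# .copy() makes the filtered result an independent dataframe,
# so assigning to it is unambiguous
south_america_df = small_df[small_df["continent"] == "S. America"].copy()
south_america_df["population"] = 0

print(south_america_df["population"].tolist())
print(small_df["population"].tolist())  # the original is untouched
```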


In [151]:
df['continent'].value_counts()

continent
Africa        54
Asia          48
Europe        42
N. America    22
S. America    12
Oceania       10
Name: count, dtype: int64

In [148]:
df['continent'].describe()

count        188
unique         6
top       Africa
freq          54
Name: continent, dtype: object

In [152]:
df['continent'].nunique()

6

In [153]:
df['continent'].unique()

array(['Asia', 'Europe', 'Africa', 'N. America', 'S. America', 'Oceania'],
      dtype=object)

## Reading stuff from the internet???

In [None]:
# ModuleNotFoundError lxml
%pip install lxml

In [16]:
import pandas as pd

url = "https://www.theguardian.com/football/laligafootball/table"
# Go onto the internet and look at that web page
# get all of the tables from it and save them in
# the variable tables
tables = pd.read_html(url)

# and now we'll count the number of tables you found
len(tables)

1

In [24]:
df = tables[0]
df.head()

Unnamed: 0,P,Team,GP,W,D,L,F,A,GD,Pts,Form
0,1,Real Madrid,38,29,8,1,87,26,61,95,Won 3-0 against Cadiz Won 4-0 against Granada...
1,2,Barcelona,38,26,7,5,79,44,35,85,Lost 2-4 to Girona Won 2-0 against Real Socie...
2,3,Girona,38,25,6,7,85,46,39,81,Won 4-2 against Barcelona Drew 2-2 with Alave...
3,4,Atlético,38,24,4,10,70,43,27,76,Won 1-0 against Mallorca Won 1-0 against Celt...
4,5,A Bilbao,38,19,11,8,61,37,24,68,Won 2-0 against Getafe Drew 2-2 with Osasuna ...


In [25]:
df.head()

Unnamed: 0,P,Team,GP,W,D,L,F,A,GD,Pts,Form
0,1,Real Madrid,38,29,8,1,87,26,61,95,Won 3-0 against Cadiz Won 4-0 against Granada...
1,2,Barcelona,38,26,7,5,79,44,35,85,Lost 2-4 to Girona Won 2-0 against Real Socie...
2,3,Girona,38,25,6,7,85,46,39,81,Won 4-2 against Barcelona Drew 2-2 with Alave...
3,4,Atlético,38,24,4,10,70,43,27,76,Won 1-0 against Mallorca Won 1-0 against Celt...
4,5,A Bilbao,38,19,11,8,61,37,24,68,Won 2-0 against Getafe Drew 2-2 with Osasuna ...


In [27]:
# Always index=False
df.to_csv("teams.csv", index=False)

In [23]:
# pd.read_csv("teams.csv")

Unnamed: 0.5,Unnamed: 0.4,Unnamed: 0.3,Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,P,Team,GP,W,D,L,F,A,GD,Pts,Form
0,0,0,0,0,0,1,Real Madrid,38,29,8,1,87,26,61,95,Won 3-0 against Cadiz Won 4-0 against Granada...
1,1,1,1,1,1,2,Barcelona,38,26,7,5,79,44,35,85,Lost 2-4 to Girona Won 2-0 against Real Socie...
2,2,2,2,2,2,3,Girona,38,25,6,7,85,46,39,81,Won 4-2 against Barcelona Drew 2-2 with Alave...
3,3,3,3,3,3,4,Atlético,38,24,4,10,70,43,27,76,Won 1-0 against Mallorca Won 1-0 against Celt...
4,4,4,4,4,4,5,A Bilbao,38,19,11,8,61,37,24,68,Won 2-0 against Getafe Drew 2-2 with Osasuna ...
5,5,5,5,5,5,6,Real Sociedad,38,16,12,10,51,39,12,60,Won 2-0 against Las Palmas Lost 0-2 to Barcel...
6,6,6,6,6,6,7,Real Betis,38,14,15,9,48,45,3,57,Won 2-0 against Osasuna Won 3-2 against Almer...
7,7,7,7,7,7,8,Villarreal,38,14,11,13,65,65,0,53,Lost 2-3 to Celta Vigo Won 3-2 against Sevill...
8,8,8,8,8,8,9,Valencia,38,13,10,15,40,45,-5,49,Lost 0-1 to Alaves Drew 0-0 with Rayo Valleca...
9,9,9,9,9,9,10,Alaves,38,12,10,16,36,46,-10,46,Won 1-0 against Valencia Drew 2-2 with Girona...
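Those piled-up `Unnamed: 0` columns are what happens when a file is saved *without* `index=False` and then read back in, over and over. Here is a small sketch of one round trip; it writes to an in-memory `io.StringIO` buffer instead of a real file, and the two-row `teams` dataframe is a cut-down stand-in for the table above.

```python
import io
import pandas as pd

teams = pd.DataFrame({"Team": ["Real Madrid", "Barcelona"], "Pts": [95, 85]})

# Default: the index is written out as an extra, unnamed first column
with_index = io.StringIO()
teams.to_csv(with_index)
with_index.seek(0)
reread_with_index = pd.read_csv(with_index)

# index=False: only the real columns are written
without_index = io.StringIO()
teams.to_csv(without_index, index=False)
without_index.seek(0)
reread_without_index = pd.read_csv(without_index)

print(reread_with_index.columns.tolist())
print(reread_without_index.columns.tolist())
```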


In [8]:
# tables = pd.read_html("http://www.bigpumpkins.com/WeighoffResultsGPC.aspx?c=L&y=2023")
# len(tables)

4

In [10]:
# tables[0].tail()

Unnamed: 0,Place,Weight (lbs),Grower Name,City,State/Prov,Country,GPC Site,Seed (Mother),Pollinator (Father),OTT,Est. Weight,Pct. Chart
261,,This page shows the GPC results for the select...,This page shows the GPC results for the select...,,,,,,,,,
262,,PlaceWeight (lbs)Grower NameCityState/ProvCoun...,PlaceWeight (lbs)Grower NameCityState/ProvCoun...,,,,,,,,,
263,,,,,,,,,,,,
264,Top of Page Questions or comments? Send mail t...,Top of Page Questions or comments? Send mail t...,Top of Page Questions or comments? Send mail t...,,,,,,,,,
265,"259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)","259 Entries. (17 exhibition only, 4 damaged)"


## API stuff??? JSON stuff???

In [28]:
import requests

url = "http://api.weatherapi.com/v1/forecast.json?key=1deea713c0524e4abd621042220711&q=London&days=3&aqi=yes&alerts=no"
response = requests.get(url)
data = response.json()

In [30]:
data.keys()

dict_keys(['location', 'current', 'forecast'])

In [32]:
data['forecast']['forecastday']

[{'date': '2024-06-12',
  'date_epoch': 1718150400,
  'day': {'maxtemp_c': 15.9,
   'maxtemp_f': 60.7,
   'mintemp_c': 7.7,
   'mintemp_f': 45.8,
   'avgtemp_c': 12.3,
   'avgtemp_f': 54.1,
   'maxwind_mph': 7.4,
   'maxwind_kph': 11.9,
   'totalprecip_mm': 1.54,
   'totalprecip_in': 0.06,
   'totalsnow_cm': 0.0,
   'avgvis_km': 9.1,
   'avgvis_miles': 5.0,
   'avghumidity': 71,
   'daily_will_it_rain': 1,
   'daily_chance_of_rain': 87,
   'daily_will_it_snow': 0,
   'daily_chance_of_snow': 0,
   'condition': {'text': 'Patchy rain nearby',
    'icon': '//cdn.weatherapi.com/weather/64x64/day/176.png',
    'code': 1063},
   'uv': 3.0},
  'astro': {'sunrise': '04:43 AM',
   'sunset': '09:18 PM',
   'moonrise': '10:45 AM',
   'moonset': '01:03 AM',
   'moon_phase': 'Waxing Crescent',
   'moon_illumination': 29,
   'is_moon_up': 1,
   'is_sun_up': 0},
  'hour': [{'time_epoch': 1718146800,
    'time': '2024-06-12 00:00',
    'temp_c': 8.7,
    'temp_f': 47.7,
    'is_day': 0,
    'condition'

In [33]:
pd.DataFrame(data['forecast']['forecastday'])

Unnamed: 0,date,date_epoch,day,astro,hour
0,2024-06-12,1718150400,"{'maxtemp_c': 15.9, 'maxtemp_f': 60.7, 'mintem...","{'sunrise': '04:43 AM', 'sunset': '09:18 PM', ...","[{'time_epoch': 1718146800, 'time': '2024-06-1..."
1,2024-06-13,1718236800,"{'maxtemp_c': 15.8, 'maxtemp_f': 60.5, 'mintem...","{'sunrise': '04:43 AM', 'sunset': '09:19 PM', ...","[{'time_epoch': 1718233200, 'time': '2024-06-1..."
2,2024-06-14,1718323200,"{'maxtemp_c': 18.7, 'maxtemp_f': 65.6, 'mintem...","{'sunrise': '04:42 AM', 'sunset': '09:19 PM', ...","[{'time_epoch': 1718319600, 'time': '2024-06-1..."


In [39]:
pd.json_normalize(data['forecast']['forecastday'])

Unnamed: 0,date,date_epoch,hour,day.maxtemp_c,day.maxtemp_f,day.mintemp_c,day.mintemp_f,day.avgtemp_c,day.avgtemp_f,day.maxwind_mph,...,day.condition.code,day.uv,astro.sunrise,astro.sunset,astro.moonrise,astro.moonset,astro.moon_phase,astro.moon_illumination,astro.is_moon_up,astro.is_sun_up
0,2024-06-12,1718150400,"[{'time_epoch': 1718146800, 'time': '2024-06-1...",15.9,60.7,7.7,45.8,12.3,54.1,7.4,...,1063,3.0,04:43 AM,09:18 PM,10:45 AM,01:03 AM,Waxing Crescent,29,1,0
1,2024-06-13,1718236800,"[{'time_epoch': 1718233200, 'time': '2024-06-1...",15.8,60.5,8.4,47.0,12.2,54.0,14.3,...,1063,4.0,04:43 AM,09:19 PM,11:57 AM,01:15 AM,Waxing Crescent,38,1,0
2,2024-06-14,1718323200,"[{'time_epoch': 1718319600, 'time': '2024-06-1...",18.7,65.6,11.9,53.5,14.6,58.3,15.2,...,1063,6.0,04:42 AM,09:19 PM,01:07 PM,01:26 AM,First Quarter,48,1,0


In [37]:
pd.json_normalize(data['forecast']['forecastday'][0]['hour'])

Unnamed: 0,time_epoch,time,temp_c,temp_f,is_day,wind_mph,wind_kph,wind_degree,wind_dir,pressure_mb,...,will_it_snow,chance_of_snow,vis_km,vis_miles,gust_mph,gust_kph,uv,condition.text,condition.icon,condition.code
0,1718146800,2024-06-12 00:00,8.7,47.7,0,3.4,5.4,24,NNE,1021.0,...,0,0,10.0,6.0,5.9,9.5,1.0,Overcast,//cdn.weatherapi.com/weather/64x64/night/122.png,1009
1,1718150400,2024-06-12 01:00,8.2,46.8,0,3.1,5.0,31,NNE,1021.0,...,0,0,10.0,6.0,5.5,8.9,1.0,Clear,//cdn.weatherapi.com/weather/64x64/night/113.png,1000
2,1718154000,2024-06-12 02:00,7.9,46.3,0,2.5,4.0,33,NNE,1021.0,...,0,0,10.0,6.0,4.3,6.9,1.0,Clear,//cdn.weatherapi.com/weather/64x64/night/113.png,1000
3,1718157600,2024-06-12 03:00,7.9,46.2,0,1.8,2.9,27,NNE,1021.0,...,0,0,2.0,1.0,3.3,5.3,1.0,Mist,//cdn.weatherapi.com/weather/64x64/night/143.png,1030
4,1718161200,2024-06-12 04:00,7.7,45.8,0,1.6,2.5,32,NNE,1022.0,...,0,0,2.0,1.0,2.9,4.6,1.0,Mist,//cdn.weatherapi.com/weather/64x64/night/143.png,1030
5,1718164800,2024-06-12 05:00,8.5,47.3,1,0.7,1.1,49,NE,1022.0,...,0,0,10.0,6.0,1.1,1.7,3.0,Partly Cloudy,//cdn.weatherapi.com/weather/64x64/day/116.png,1003
6,1718168400,2024-06-12 06:00,9.6,49.3,1,0.4,0.7,354,N,1022.0,...,0,0,10.0,6.0,0.6,1.0,2.0,Patchy rain nearby,//cdn.weatherapi.com/weather/64x64/day/176.png,1063
7,1718172000,2024-06-12 07:00,11.2,52.1,1,1.1,1.8,11,NNE,1022.0,...,0,0,5.0,3.0,1.4,2.2,3.0,Patchy light drizzle,//cdn.weatherapi.com/weather/64x64/day/263.png,1150
8,1718175600,2024-06-12 08:00,12.2,53.9,1,0.7,1.1,12,NNE,1022.0,...,0,0,10.0,6.0,0.8,1.3,3.0,Light rain shower,//cdn.weatherapi.com/weather/64x64/day/353.png,1240
9,1718179200,2024-06-12 09:00,13.0,55.5,1,0.9,1.4,246,WSW,1021.0,...,0,0,10.0,6.0,1.1,1.8,3.0,Light rain shower,//cdn.weatherapi.com/weather/64x64/day/353.png,1240
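The cell above only gets the hours for the first day (`[0]`). If you wanted every hour from every day in one dataframe, one option is `pd.json_normalize` with its `record_path` and `meta` parameters. This is a sketch on a heavily trimmed, partly made-up stand-in for `data['forecast']['forecastday']` (the `forecast_days` name and the second day's temperature are invented), so it runs without calling the API.

```python
import pandas as pd

# Trimmed, partly made-up stand-in for data['forecast']['forecastday']
forecast_days = [
    {"date": "2024-06-12", "hour": [
        {"time": "2024-06-12 00:00", "temp_c": 8.7},
        {"time": "2024-06-12 01:00", "temp_c": 8.2},
    ]},
    {"date": "2024-06-13", "hour": [
        {"time": "2024-06-13 00:00", "temp_c": 9.0},  # invented value
    ]},
]

# record_path: which nested list becomes the rows
# meta: which outer fields to repeat on every row
hours = pd.json_normalize(forecast_days, record_path="hour", meta=["date"])
print(hours)
```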
