# INTRODUCTION TO PYTHON FOR DATA SCIENCE

## _Python Fundamentals through Examples_

## EIPA
online, September 18 - 22, 2023

### [Dr. Christian Kauth](https://www.linkedin.com/in/ckauth/)

# Functions

<img src="https://images.unsplash.com/photo-1586473219010-2ffc57b0d282?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1364&q=80" alt="crossroad" height="600px"/>

# Documentation

- [python docs](https://docs.python.org/)
- [w3schools](https://www.w3schools.com/python)

# Setup

## Some Randomness, for Fun

In [1]:
from IPython.display import Markdown as md
import os
import numpy as np
import pandas as pd

import random
random.seed(0) # pick your seed

## Data

In [2]:
!pip install eurostatapiclient

Collecting eurostatapiclient
  Downloading eurostatapiclient-0.3.0-py3-none-any.whl (12 kB)
Installing collected packages: eurostatapiclient
Successfully installed eurostatapiclient-0.3.0


In [3]:
from eurostatapiclient import EurostatAPIClient

# Set version, format and language (so far only the values used here are available), then create the client
VERSION = '1.0'
FORMAT = 'json'
LANGUAGE = 'en'
client = EurostatAPIClient(VERSION, FORMAT, LANGUAGE)

In [4]:
%%html
<iframe src="https://ec.europa.eu/eurostat/databrowser/view/pat_ep_nipc/default/table?lang=en" width="1000" height="800"></iframe>

In [5]:
countries_names = {'AT':'Austria', 'BE':'Belgium', 'BG':'Bulgaria', 'CY': 'Cyprus',
                   'CZ': 'Czechia', 'DE': 'Germany', 'DK': 'Denmark', 'EE':'Estonia',
                   'EL': 'Greece', 'ES':'Spain', 'FI':'Finland', 'FR':'France',
                   'HR':'Croatia', 'HU':'Hungary', 'IE':'Ireland', 'IT':'Italy',
                   'LT':'Lithuania', 'LU':'Luxembourg', 'LV':'Latvia', 'MT': 'Malta',
                   'NL':'Netherlands', 'PL':'Poland', 'PT':'Portugal', 'RO':'Romania',
                   'SE':'Sweden', 'SI':'Slovenia', 'SK':'Slovakia', 'UK':'United Kingdom'}

patent_sections = {'A': 'Human necessities',
                   'B': 'Performing operations; transporting',
                   'C': 'Chemistry; metallurgy',
                   'D': 'Textiles; paper',
                   'E': 'Fixed constructions',
                   'F': 'Mechanical engineering; lighting; heating; weapons; blasting',
                   'G': 'Physics',
                   'H': 'Electricity'}

In [6]:
par_df1 = {
    'ipc': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
    'unit': ['NR', 'P_MHAB'],
    'geo': list(countries_names.keys()),
}

df1 = client.get_dataset('pat_ep_nipc', params=par_df1).to_dataframe()

df1.rename(columns={'geo': 'country', 'time': 'year'}, inplace=True)
df1['year'] = df1['year'].astype('int')
df1['country'] = df1['country'].map(countries_names)
df1['ipc'] = df1['ipc'].map(patent_sections)

In [7]:
data_dir = '.'

In [8]:
filename = os.path.join(data_dir, 'pat_ep_nipc.csv')
df1.to_csv(filename, index=False)

In [9]:
print(len(df1))
print(df1.dtypes)
df1.sample(5)

16576
values     float64
freq        object
ipc         object
unit        object
country     object
year         int64
dtype: object


Unnamed: 0,values,freq,ipc,unit,country,year
5898,10.401,A,Chemistry; metallurgy,P_MHAB,Austria,1992
10768,14.82,A,Mechanical engineering; lighting; heating; wea...,NR,Italy,1978
10378,19.35,A,Mechanical engineering; lighting; heating; wea...,NR,Belgium,1995
2308,10.27,A,Performing operations; transporting,NR,Ireland,1991
15900,24.488,A,Electricity,P_MHAB,France,2004


In [10]:
df2 = df1.pivot(index='year', columns=['ipc', 'unit', 'country'], values='values')

In [11]:
df2

ipc,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,...,Electricity,Electricity,Electricity,Electricity,Electricity,Electricity,Electricity,Electricity,Electricity,Electricity
unit,NR,NR,NR,NR,NR,NR,NR,NR,NR,NR,...,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB
country,Belgium,Bulgaria,Czechia,Denmark,Germany,Estonia,Ireland,Greece,Spain,France,...,Netherlands,Austria,Poland,Portugal,Romania,Slovenia,Slovakia,Finland,Sweden,United Kingdom
year
1977,2.6,,,6.86,129.03,,1.0,0.17,,28.81,...,0.823,0.066,,,,,,0.211,0.743,0.517
1978,22.32,,,10.07,376.38,,0.5,,2.33,122.67,...,1.686,1.172,,,,,,0.042,1.444,2.093
1979,27.12,,,15.77,531.44,,6.0,1.08,6.58,197.51,...,5.405,2.571,0.05,,,,,0.042,3.747,3.086
1980,32.42,,,28.61,634.87,,6.83,0.33,9.45,253.21,...,4.932,2.225,0.085,,0.09,,,0.31,4.984,3.28
1981,37.78,1.0,,35.6,710.88,,5.67,2.0,10.71,359.28,...,8.203,2.787,,,,,,1.201,6.627,4.543
1982,56.49,1.6,,29.78,837.91,,6.82,1.72,12.78,397.29,...,13.406,3.92,,,,,,0.563,8.358,5.403
1983,70.59,1.47,,42.93,897.3,,7.93,0.82,17.9,388.11,...,14.562,3.191,0.102,,,,,0.847,7.11,6.699
1984,62.54,4.84,,61.22,1073.74,,8.73,1.13,11.74,433.69,...,15.975,4.414,0.027,0.017,0.022,,,0.86,7.158,6.814
1985,65.69,0.65,,61.01,1107.31,,4.8,1.67,27.25,507.86,...,17.203,5.543,0.042,,,,,2.377,8.858,7.531
1986,68.12,0.98,,61.91,1228.85,,8.97,1.24,31.99,516.97,...,19.841,4.705,0.094,0.1,,,,1.764,6.651,7.647


# Built-in Functions

In [12]:
%%html
<iframe src="https://docs.python.org/3/library/functions.html" width="1200" height="800"></iframe>

In [13]:
city = input("Where are you?")

Where are you?Lausanne


## Reminder: Shallow and Deep Copy

In [14]:
a = 0
b = a

print('-- before reassigning "a"')
print(f'a = {a}, b = {b}')
print(f'a @ {id(a)}, b @ {id(b)}')

print('\n-- after reassigning "a"')
a = 1
print(f'a = {a}, b = {b}')
print(f'a @ {id(a)}, b @ {id(b)}')

-- before reassigning "a"
a = 0, b = 0
a @ 135391251923152, b @ 135391251923152

-- after reassigning "a"
a = 1, b = 0
a @ 135391251923184, b @ 135391251923152


In [15]:
a = [1, 2]
b = a

print('-- before modifying "a"')
print(f'a = {a}, b = {b}')
print(f'a @ {id(a)}, b @ {id(b)}')

print('\n-- after modifying "a" in place')
a[0] = -1
print(f'a = {a}, b = {b}')
print(f'a @ {id(a)}, b @ {id(b)}')

-- before modifying "a"
a = [1, 2], b = [1, 2]
a @ 135390045569472, b @ 135390045569472

-- after modifying "a" in place
a = [-1, 2], b = [-1, 2]
a @ 135390045569472, b @ 135390045569472


In [16]:
a = [1, 2]
b = a.copy()

print('-- before modifying "a"')
print(f'a = {a}, b = {b}')
print(f'a @ {id(a)}, b @ {id(b)}')

print('\n-- after modifying "a" in place')
a[0] = -1
print(f'a = {a}, b = {b}')
print(f'a @ {id(a)}, b @ {id(b)}')

-- before modifying "a"
a = [1, 2], b = [1, 2]
a @ 135390044970112, b @ 135390044970048

-- after modifying "a" in place
a = [-1, 2], b = [1, 2]
a @ 135390044970112, b @ 135390044970048
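
The cells above contrast plain assignment with `list.copy()`, which is a shallow copy. A deep copy only makes a difference once the list contains other mutable objects. The following is a minimal sketch of that case; the variable names `shallow` and `deep` are chosen here for illustration.

```python
import copy

# Nested list: the outer list holds references to two inner lists.
a = [[1, 2], [3, 4]]
shallow = a.copy()         # new outer list, but the same inner lists
deep = copy.deepcopy(a)    # new outer list and new inner lists

a[0][0] = -1               # modify an inner list in place

print(f'a       = {a}')
print(f'shallow = {shallow}')  # shares the inner lists, so it sees the change
print(f'deep    = {deep}')     # fully independent, so it is unchanged
```

For flat lists of numbers or strings, as in the cells above, a shallow copy is usually sufficient.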


In [17]:
dir(__builtin__)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 'UnicodeDecode

In [18]:
print("Prints its arguments", "to screen")

Prints its arguments to screen


In [19]:
len("abc")

3

In [20]:
min(0, 2, *[1, 2, 3, -4], 88)

-4

In [21]:
min("luxembourg", "Sweden")

'Sweden'
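
The result is `'Sweden'` because strings are compared character by character using their Unicode code points, and uppercase letters come before lowercase ones. If a case-insensitive comparison is wanted, the optional `key` parameter of `min` can be used, as in this small sketch:

```python
# Default comparison: 'S' (code point 83) is smaller than 'l' (code point 108).
default_min = min("luxembourg", "Sweden")
print(default_min)

# Case-insensitive comparison: each string is lowercased before comparing.
case_insensitive_min = min("luxembourg", "Sweden", key=str.lower)
print(case_insensitive_min)
```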

In [22]:
abs(-100)

100

In [23]:
round(3.1415926)

3

In [24]:
sorted("Netherlands")

['N', 'a', 'd', 'e', 'e', 'h', 'l', 'n', 'r', 's', 't']

In [25]:
type(sorted("Netherlands"))

list

## 🧑‍💻 Exercise
Use the _help_ function to learn more about one of these built-in functions. Then try to use one of its optional parameters!

In [26]:
# your code here

## Comment: R in Python

In [27]:
import rpy2.robjects as robjects

In [28]:
# activate R magic
%load_ext rpy2.ipython

In [29]:
%%R
a <- 1
a

[1] 1


In [30]:
#@title a few examples (solution)
print(round(3.141596, ndigits=4))
print(round(123.456, -1))
print(*reversed("Luxembourg"), sep='_')
print(sorted(random.sample(range(10, 30), 10), reverse=True))
print(format(123.4567898, "f"))
print(eval("1 > 2"))
myText = "round(3.141596, ndigits=4)"
eval(myText)

3.1416
120.0
g_r_u_o_b_m_e_x_u_L
[29, 25, 23, 22, 18, 17, 16, 15, 14, 11]
123.456790
False


3.1416

For more details on string formatting (and there are many), have a look at [Common string operations](https://docs.python.org/3/library/string.html#string-formatting).
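
As a small taste of what format specifications can do, here is a sketch with a few commonly used ones. The numbers are arbitrary illustration values, not taken from the dataset.

```python
value = 123.4567898

fixed_default = format(value, "f")            # fixed-point, six decimals by default
two_decimals = format(value, ".2f")           # fixed-point, two decimals
with_thousands = format(1234567.891, ",.1f")  # thousands separator, one decimal
as_percent = f"{0.256:.1%}"                   # the same mini-language works in f-strings

print(fixed_default)
print(two_decimals)
print(with_thousands)
print(as_percent)
```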

## Built-in Magic Commands (IPython)

In [31]:
%%html
<iframe src="https://ipython.readthedocs.io/en/stable/interactive/magics.html" width="1200" height="800"></iframe>


In [32]:
%lsmagic

Available line magics:
%R  %Rdevice  %Rget  %Rpull  %Rpush  %alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %conda  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %shell  %store  %sx  %system  %tb  %tensorflow_version  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%R  %%SVG  %%bash  %%bigquery  %%capture  %%debug  

# Custom Functions

## Definition & Invocation

In [33]:
def greet():
  print('Welcome back, did you have a nice break?')
  #return None

In [34]:
greet()

Welcome back, did you have a nice break?


In [35]:
message = greet()
print("Message: ", message, ".")

Welcome back, did you have a nice break?
Message:  None .


In [36]:
def greet():
  return 'Welcome back, did you have a nice break?'

In [37]:
message = greet()
print("Message: ", message, ".")

Message:  Welcome back, did you have a nice break? .


## Positional Arguments

In [38]:
def personal_greet(name):
  print(f'Welcome back {name}, did you have a nice break?')

In [39]:
personal_greet('Mary')
personal_greet('John')

Welcome back Mary, did you have a nice break?
Welcome back John, did you have a nice break?


In [40]:
def personal_greet(name, break_type):
  print(f'Welcome back {name}, did you have a nice {break_type} break?')

In [41]:
personal_greet('Mary', 'coffee')
personal_greet('John', 'lunch')
#personal_greet('John')

Welcome back Mary, did you have a nice coffee break?
Welcome back John, did you have a nice lunch break?


## Default Arguments

In [42]:
def personal_greet(name, break_type='coffee'):
  print(f'Welcome back {name}, did you have a nice {break_type} break?')

In [43]:
personal_greet('Mary', 'coffee')
personal_greet('John', 'lunch')
personal_greet('John')
personal_greet('lunch')

Welcome back Mary, did you have a nice coffee break?
Welcome back John, did you have a nice lunch break?
Welcome back John, did you have a nice coffee break?
Welcome back lunch, did you have a nice coffee break?


In [44]:
def personal_greet(name='anonymous', break_type='coffee'):
  print(f'Welcome back {name}, did you have a nice {break_type} break?')

In [45]:
personal_greet('Mary', 'coffee')
personal_greet('John', 'lunch')
personal_greet('John')
personal_greet()

Welcome back Mary, did you have a nice coffee break?
Welcome back John, did you have a nice lunch break?
Welcome back John, did you have a nice coffee break?
Welcome back anonymous, did you have a nice coffee break?


## Keyword Arguments

In [46]:
personal_greet(break_type='lunch')
personal_greet(break_type='lunch', name='Bob')
#personal_greet(,'lunch')

Welcome back anonymous, did you have a nice lunch break?
Welcome back Bob, did you have a nice lunch break?


## Arbitrary Arguments

Also called _variable-length arguments_

`*args` for positional arguments

`**kwargs` for keyword arguments
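
Inside the function, `*args` arrives as a tuple and `**kwargs` as a dictionary. The same `*` and `**` operators can also unpack an existing list or dictionary at the call site. The helper `describe_call` below is a hypothetical function written only to make this visible.

```python
def describe_call(*args, **kwargs):
  # args is a tuple of the positional arguments, kwargs a dict of the keyword arguments
  return type(args).__name__, type(kwargs).__name__, len(args), sorted(kwargs)

direct = describe_call('Alice', 'Bob', course='Python', sessions=4)
print(direct)

# Unpacking existing collections into a call gives the same result.
team = ['Alice', 'Bob']
details = {'course': 'Python', 'sessions': 4}
unpacked = describe_call(*team, **details)
print(unpacked)
```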

In [47]:
def welcome_team(team):
  print(f"Welcome to the course, {' & '.join(team)}!")

welcome_team(['Alice', 'Bob', 'Mary'])

Welcome to the course, Alice & Bob & Mary!


In [48]:
def welcome_team(*team):
  print(f"Welcome to the course, {' & '.join(team)}!")

In [49]:
welcome_team('Alice', 'Bob', 'Mary')

Welcome to the course, Alice & Bob & Mary!


In [50]:
def welcome_team(*team, **course):
  print(f"Welcome to the Python course, {' & '.join(team)}!")
  for k, v in course.items():
    print(f"{k} : {v}")

In [51]:
welcome_team('Alice', 'Bob', 'Mary', course='Python', sessions=4)

Welcome to the Python course, Alice & Bob & Mary!
course : Python
sessions : 4


In [52]:
welcome_team('Alice', 'Bob', 'Mary', course='Tidyverse', sessions=4, prerequisites='Mastery of R')

Welcome to the Python course, Alice & Bob & Mary!
course : Tidyverse
sessions : 4
prerequisites : Mastery of R


## Return Values

In [53]:
def get_personal_greet(name):
  return f'Welcome back {name}, did you have a nice break?'

In [54]:
whatsapp_message = get_personal_greet('Mary')
whatsapp_message

'Welcome back Mary, did you have a nice break?'

In [55]:
whatsapp_messages = [get_personal_greet(name) for name in ['Mary', 'John']]
whatsapp_messages

['Welcome back Mary, did you have a nice break?',
 'Welcome back John, did you have a nice break?']

In [56]:
from datetime import datetime

def get_personal_greet_and_time(name):
  message = f'Welcome back {name}, did you have a nice break?'
  timestamp = datetime.now().strftime("%A, %B %d %Y - %H:%M:%S")
  return message, timestamp

In [57]:
get_personal_greet_and_time("Mary")

('Welcome back Mary, did you have a nice break?',
 'Friday, September 22 2023 - 05:29:45')
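
Returning `message, timestamp` packs both values into a single tuple. The caller can keep the tuple as is or unpack it into separate variables. A minimal sketch, using a copy of the function under the hypothetical name `get_greet_and_time` so that it runs on its own:

```python
from datetime import datetime

def get_greet_and_time(name):
  message = f'Welcome back {name}, did you have a nice break?'
  timestamp = datetime.now().strftime("%A, %B %d %Y - %H:%M:%S")
  return message, timestamp  # packed into one tuple

result = get_greet_and_time('Mary')             # keep the tuple
greeting, sent_at = get_greet_and_time('Mary')  # or unpack it

print(type(result).__name__, len(result))
print(greeting)
```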

In [58]:
from datetime import datetime

def get_personal_greet_and_time(name):
  message = f'Welcome back {name}, did you have a nice break?'
  timestamp = datetime.now().strftime("%A, %B %d %Y - %H:%M:%S")
  return {'greeting': message,
          'time': timestamp}

In [59]:
get_personal_greet_and_time("Mary")

{'greeting': 'Welcome back Mary, did you have a nice break?',
 'time': 'Friday, September 22 2023 - 05:29:45'}

In [60]:
whatsapp_messages = [get_personal_greet_and_time(name) for name in ['Mary', 'John']]
whatsapp_messages

[{'greeting': 'Welcome back Mary, did you have a nice break?',
  'time': 'Friday, September 22 2023 - 05:29:45'},
 {'greeting': 'Welcome back John, did you have a nice break?',
  'time': 'Friday, September 22 2023 - 05:29:45'}]

In [61]:
import numpy as np

def factors(val):
  divs = []
  for f in range(1, int(np.ceil(np.sqrt(val))) + 1):
    if val % f == 0:
      divs += [f, val // f]
  return sorted(list(np.unique(divs)))

factors(25)

[1, 5, 25]

In [62]:
def factors(val):
  divs = []
  for f in range(1, int(np.ceil(np.sqrt(val))) + 1):
    if val % f == 0:
      divs += [f, val // f]
  return sorted(set(divs))

factors(9)

[1, 3, 9]

In [63]:
def factors(val):
  divs = set()
  for f in range(1, int(np.ceil(np.sqrt(val))) + 1):
    if val % f == 0:
      divs.add(f)
      divs.add(val // f)
  return sorted(divs)

factors(9)

[1, 3, 9]
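
The three versions above rely on NumPy only for the square root. As a possible variation, assuming Python 3.8 or newer and a positive integer input, `math.isqrt` from the standard library gives the integer square root directly. The name `factors_isqrt` is chosen here for illustration.

```python
import math

def factors_isqrt(val):
  divs = set()
  # Every divisor f <= sqrt(val) has a partner val // f >= sqrt(val).
  for f in range(1, math.isqrt(val) + 1):
    if val % f == 0:
      divs.add(f)
      divs.add(val // f)
  return sorted(divs)

print(factors_isqrt(9))
print(factors_isqrt(28))
```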

## Recursion

Fibonacci numbers: 1, 1, 2, 3, 5, 8, 13, 21 ...

In [64]:
def fibonacci_number(n):
  return fibonacci_number(n-1) + fibonacci_number(n-2) if n>2 else 1

In [65]:
def fibonacci_number_faster(n):
  first_fib, second_fib = 1, 1
  while (n > 2):
    first_fib, second_fib = first_fib + second_fib, first_fib
    n -= 1  # n = n - 1
  return first_fib

In [66]:
n = 8000
#print(fibonacci_number(n))
print(fibonacci_number_faster(n))

3561533204460626739768914905427460387141369539110154082973500638991885819498711815304829246223963373749873423083216889782034228521693267175594214186111978816819236959743284321273097535654614718808050244321699002512466203835566030351092652496815708455980825654877181538741827129421689128991879649533246136168998590044965735035810856774605383628378979290580539135791985063484992877932473487054068899476937399295193905527420792975902913836012199062687063537510151753758100626402591751183925883151617648375005313453493271681248233059858496951790113255897429539560654496639601132039360167542277472498901884679404509894269174519328918160745655327632006736189766801968534195725815421784083495026969542066047758885029695257263330719223956309043195653930347983496830801755572982419821881275569179922973415736010289561700699477021488635509784509168019589640190234350021673802856836365767446249424907273016689053388000785637444921523414602360860001530139933615215383220927084750528293779491002813557093860863839
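
The plain recursive version recomputes the same Fibonacci numbers over and over, which is why its call is commented out for large `n`. One common remedy, sketched here as an alternative rather than as part of the course material, is to cache results with `functools.lru_cache`. The name `fibonacci_number_cached` is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # remember every result that has been computed
def fibonacci_number_cached(n):
  return fibonacci_number_cached(n-1) + fibonacci_number_cached(n-2) if n > 2 else 1

first_eight = [fibonacci_number_cached(n) for n in range(1, 9)]
print(first_eight)
print(fibonacci_number_cached(50))
```

Caching removes the repeated work, but the function still recurses, so a very large `n` can exceed Python's recursion limit. The loop-based version above remains the safer choice in that case.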

### 🧑‍💻 Exercise

In [67]:
ipc = random.choice(list(patent_sections.values()))
md(f"##❓ Write a function that returns the number of patents in a given section for a given country in a given year.")

##❓ Write a function that returns the number of patents in a given section for a given country in a given year.

In [68]:
df2.sample(5)

ipc,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,Human necessities,...,Electricity,Electricity,Electricity,Electricity,Electricity,Electricity,Electricity,Electricity,Electricity,Electricity
unit,NR,NR,NR,NR,NR,NR,NR,NR,NR,NR,...,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB,P_MHAB
country,Belgium,Bulgaria,Czechia,Denmark,Germany,Estonia,Ireland,Greece,Spain,France,...,Netherlands,Austria,Poland,Portugal,Romania,Slovenia,Slovakia,Finland,Sweden,United Kingdom
year
1984,62.54,4.84,,61.22,1073.74,,8.73,1.13,11.74,433.69,...,15.975,4.414,0.027,0.017,0.022,,,0.86,7.158,6.814
2005,300.19,3.73,21.27,377.17,3508.6,0.67,87.63,26.6,348.77,1423.56,...,40.256,28.361,0.565,2.713,0.12,3.089,1.971,105.018,63.158,16.203
1987,75.4,8.08,,62.94,1365.53,,21.39,3.5,37.54,616.23,...,21.149,5.962,0.031,0.008,,,,2.526,8.247,7.766
1992,95.31,1.0,3.37,106.52,1351.92,0.19,17.37,11.62,58.2,727.85,...,22.34,6.7,0.009,0.158,0.014,0.875,,29.401,19.093,7.753
2010,257.85,6.5,19.61,264.35,3270.13,3.2,93.27,16.3,383.05,1351.17,...,35.137,25.223,1.538,1.124,0.287,5.232,1.044,80.59,93.794,16.149


In [69]:
def number_of_patents_per_section_country_year(section, country, year):
  # your code here
  pass

In [70]:
# from Frank
def number_of_patents_per_section_country_year(section, country, year):
  try:
    return df2.loc[year, (section, 'NR', country)]
  except KeyError:  # unknown year, section or country
    print("oops, this lookup failed")

number_of_patents_per_section_country_year('Human necessities', 'Germany', 2000)

2690.6

In [71]:
df1.sample(3)

Unnamed: 0,values,freq,ipc,unit,country,year
952,95.7,A,Human necessities,NR,Finland,2004
8955,19.12,A,Fixed constructions,NR,Netherlands,1978
10915,,A,Mechanical engineering; lighting; heating; wea...,NR,Luxembourg,1977


In [72]:
#@title Solution
def number_of_patents_per_section_country_year(section, country, year):
  return df1[(df1['ipc'] == section) & (df1['country'] == country) & (df1['year'] == year) & (df1['unit'] == 'NR')]['values'].values[0]

In [73]:
number_of_patents_per_section_country_year('Human necessities', 'Germany', 2000)

2690.6

In [74]:
ipc = random.choice(list(patent_sections.values()))
md(f"##❓ Write a function that returns the number of patents across all sections for a given country in a given year.")

##❓ Write a function that returns the number of patents across all sections for a given country in a given year.

Hint: Make use of your previous function `number_of_patents_per_section_country_year()`

In [75]:
list(patent_sections.values())

['Human necessities',
 'Performing operations; transporting',
 'Chemistry; metallurgy',
 'Textiles; paper',
 'Fixed constructions',
 'Mechanical engineering; lighting; heating; weapons; blasting',
 'Physics',
 'Electricity']

In [76]:
def number_of_patents_per_country_year(country, year):
  # your code here
  pass

In [77]:
#@title Solution
def number_of_patents_per_country_year(country, year):
  n_patents = 0
  for section in patent_sections.values():
    n_patents += number_of_patents_per_section_country_year(section,
                                                            country,
                                                            year)

  return n_patents

In [78]:
#@title Solution-2
def number_of_patents_per_country_year(country, year):
  return sum([number_of_patents_per_section_country_year(section, country, year) for section in patent_sections.values()])

In [79]:
number_of_patents_per_country_year('Italy', 1999)

3734.8500000000004

## Reassignment of Variables in Functions

In [80]:
def cubic_root(val):
  val **= (1/3) # val = val ** (1/3)
  myNewLocalVariable = 10
  print('Value inside function: ', val)
  print('local Variable: ', myNewLocalVariable)

In [81]:
val = 27
print('Value before function: ', val)

cubic_root(val)

print('Value after function: ', val)

Value before function:  27
Value inside function:  3.0
local Variable:  10
Value after function:  27


In [83]:
#print(myNewLocalVariable)

In [84]:
def cubic_root_global_effect():
  global val
  val **= (1/3)
  print('Value inside function: ', val)

In [85]:
val = 27
print('Value before function: ', val)

cubic_root_global_effect()

print('Value after function: ', val)

Value before function:  27
Value inside function:  3.0
Value after function:  3.0
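
Using `global` works, but it hides the fact that the function changes a variable outside itself. A frequently preferred alternative is to return the new value and let the caller decide whether to overwrite its variable. A minimal sketch, with the hypothetical name `cubic_root_returned`:

```python
def cubic_root_returned(val):
  # Works on the local name only and hands the result back to the caller.
  return val ** (1/3)

val = 27
print('Value before function: ', val)

val = cubic_root_returned(val)  # the reassignment is explicit at the call site

print('Value after function: ', val)
```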


## Modification of Complex Types in Functions

In [86]:
patents_physics_NL = dict(df2['Physics', 'NR', 'Netherlands'][df2.index < 2000].items())
patents_physics_NL

{1977: 11.46,
 1978: 27.88,
 1979: 68.58,
 1980: 101.9,
 1981: 121.82,
 1982: 146.02,
 1983: 175.06,
 1984: 191.46,
 1985: 215.72,
 1986: 196.41,
 1987: 227.56,
 1988: 257.48,
 1989: 258.56,
 1990: 237.18,
 1991: 251.54,
 1992: 247.36,
 1993: 232.68,
 1994: 240.18,
 1995: 303.21,
 1996: 385.43,
 1997: 429.89,
 1998: 532.18,
 1999: 627.59}

In [87]:
# no copy: the function receives a reference and modifies the caller's dict in place
patents_physics_NL = dict(df2['Physics', 'NR', 'Netherlands'][df2.index < 2000].items())

def add_patents_since_y2k(patents):
  patents.update(dict(df2['Physics', 'NR', 'Netherlands'][df2.index >= 2000].items()))

add_patents_since_y2k(patents_physics_NL)
patents_physics_NL

{1977: 11.46,
 1978: 27.88,
 1979: 68.58,
 1980: 101.9,
 1981: 121.82,
 1982: 146.02,
 1983: 175.06,
 1984: 191.46,
 1985: 215.72,
 1986: 196.41,
 1987: 227.56,
 1988: 257.48,
 1989: 258.56,
 1990: 237.18,
 1991: 251.54,
 1992: 247.36,
 1993: 232.68,
 1994: 240.18,
 1995: 303.21,
 1996: 385.43,
 1997: 429.89,
 1998: 532.18,
 1999: 627.59,
 2000: 775.48,
 2001: 1082.78,
 2002: 984.91,
 2003: 1098.98,
 2004: 997.94,
 2005: 855.95,
 2006: 836.64,
 2007: 699.44,
 2008: 727.58,
 2009: 667.42,
 2010: 566.42,
 2011: 691.03,
 2012: 638.58,
 2013: 433.71}

In [88]:
# shallow copy: the function modifies the copy, so the original dict stays unchanged

patents_physics_NL = dict(df2['Physics', 'NR', 'Netherlands'][df2.index < 2000].items())

def add_patents_since_y2k(patents):
  patents.update(dict(df2['Physics', 'NR', 'Netherlands'][df2.index >= 2000].items()))

add_patents_since_y2k(patents_physics_NL.copy())
patents_physics_NL

{1977: 11.46,
 1978: 27.88,
 1979: 68.58,
 1980: 101.9,
 1981: 121.82,
 1982: 146.02,
 1983: 175.06,
 1984: 191.46,
 1985: 215.72,
 1986: 196.41,
 1987: 227.56,
 1988: 257.48,
 1989: 258.56,
 1990: 237.18,
 1991: 251.54,
 1992: 247.36,
 1993: 232.68,
 1994: 240.18,
 1995: 303.21,
 1996: 385.43,
 1997: 429.89,
 1998: 532.18,
 1999: 627.59}
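
Another way to protect the caller's data is to have the function build and return a new dictionary instead of updating its argument. The sketch below uses a handful of the values shown above and a hypothetical function name, `with_added_patents`, so that it runs without the DataFrame.

```python
def with_added_patents(patents, new_patents):
  # Build and return a new dict; neither argument is modified.
  return {**patents, **new_patents}

before_2000 = {1998: 532.18, 1999: 627.59}
since_2000 = {2000: 775.48, 2001: 1082.78}

all_years = with_added_patents(before_2000, since_2000)

print(before_2000)  # still only the years before 2000
print(all_years)
```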

# Lambdas

An anonymous function is a function that is defined without a name. While normal functions are defined using the `def` keyword, anonymous functions are defined using the `lambda` keyword. Anonymous functions are also called lambda functions.

Lambda functions can have any number of arguments but only one expression. The expression is evaluated and returned. Lambda functions can be used wherever function objects are required.
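
As a short illustration of both points, here is a sketch of a lambda with two arguments and of a lambda passed directly as the `key` of `max`. The names and numbers are made up for the example and are not taken from the dataset.

```python
# Two arguments, one expression; the value of the expression is returned.
share = lambda part, total: round(100 * part / total, 1)
print(share(25, 200))

# A lambda passed where a function object is expected (illustrative numbers).
patents = {'Physics': 30, 'Electricity': 20, 'Chemistry; metallurgy': 10}
top_section = max(patents, key=lambda section: patents[section])
print(top_section)
```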

In [89]:
def cubic_root(val):
  return val**(1/3)

for i in [1, 8, 27]:
  print(cubic_root(i))

my_cubic_roots = list(map(cubic_root, [1, 8, 27]))
my_cubic_roots

1.0
2.0
3.0


[1.0, 2.0, 3.0]

In [90]:
cubic_root = lambda val : val**(1/3)

for i in [1, 8, 27]:
  print(cubic_root(i))

list(map(cubic_root, [1, 8, 27]))

1.0
2.0
3.0


[1.0, 2.0, 3.0]

In [91]:
list(map(lambda val : val**(1/3), [1, 8, 27]))

[1.0, 2.0, 3.0]

In [92]:
sorted(countries_names.items())

[('AT', 'Austria'),
 ('BE', 'Belgium'),
 ('BG', 'Bulgaria'),
 ('CY', 'Cyprus'),
 ('CZ', 'Czechia'),
 ('DE', 'Germany'),
 ('DK', 'Denmark'),
 ('EE', 'Estonia'),
 ('EL', 'Greece'),
 ('ES', 'Spain'),
 ('FI', 'Finland'),
 ('FR', 'France'),
 ('HR', 'Croatia'),
 ('HU', 'Hungary'),
 ('IE', 'Ireland'),
 ('IT', 'Italy'),
 ('LT', 'Lithuania'),
 ('LU', 'Luxembourg'),
 ('LV', 'Latvia'),
 ('MT', 'Malta'),
 ('NL', 'Netherlands'),
 ('PL', 'Poland'),
 ('PT', 'Portugal'),
 ('RO', 'Romania'),
 ('SE', 'Sweden'),
 ('SI', 'Slovenia'),
 ('SK', 'Slovakia'),
 ('UK', 'United Kingdom')]

In [93]:
sorted(countries_names.items(),
       key = lambda tup: tup[1])

[('AT', 'Austria'),
 ('BE', 'Belgium'),
 ('BG', 'Bulgaria'),
 ('HR', 'Croatia'),
 ('CY', 'Cyprus'),
 ('CZ', 'Czechia'),
 ('DK', 'Denmark'),
 ('EE', 'Estonia'),
 ('FI', 'Finland'),
 ('FR', 'France'),
 ('DE', 'Germany'),
 ('EL', 'Greece'),
 ('HU', 'Hungary'),
 ('IE', 'Ireland'),
 ('IT', 'Italy'),
 ('LV', 'Latvia'),
 ('LT', 'Lithuania'),
 ('LU', 'Luxembourg'),
 ('MT', 'Malta'),
 ('NL', 'Netherlands'),
 ('PL', 'Poland'),
 ('PT', 'Portugal'),
 ('RO', 'Romania'),
 ('SK', 'Slovakia'),
 ('SI', 'Slovenia'),
 ('ES', 'Spain'),
 ('SE', 'Sweden'),
 ('UK', 'United Kingdom')]
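
For the common case of sorting by one element of a tuple, the standard library offers `operator.itemgetter` as an alternative to a lambda. A small sketch on a three-country subset of the dictionary:

```python
from operator import itemgetter

countries = {'SE': 'Sweden', 'AT': 'Austria', 'EL': 'Greece'}

# itemgetter(1) behaves like lambda tup: tup[1]
by_name = sorted(countries.items(), key=itemgetter(1))
by_name_desc = sorted(countries.items(), key=itemgetter(1), reverse=True)

print(by_name)
print(by_name_desc)
```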

### 🧑‍💻 Exercise

In [94]:
md(f"##❓ Write a function that sorts _countries_names_ by the length of the country names.")

##❓ Write a function that sorts _countries_names_ by the length of the country names.

In [95]:
def sorted_by_length():
  # your code here
  pass

sorted_by_length()

# Generators

Functions can `print` something, `return` something, modify something. And they can `yield` something.

A different way of creating output is with the `yield` statement. Functions based on yield are called **generators**.

A `yield` statement acts similarly to `return` in that it relays output to the place the function was called from. Unlike `return`, however, `yield` does not end the function. Instead, the function pauses, keeps its local state, and resumes right after the last `yield` when the next value is requested.

In [96]:
def simpleGeneratorFun():
  yield 1
  yield 2
  yield 3

for value in simpleGeneratorFun():
  print(value)

1
2
3
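
The pausing and resuming becomes visible when values are requested one at a time with the built-in `next`. The generator `count_up_to_three` below is a hypothetical variant of the example above with extra `print` calls.

```python
def count_up_to_three():
  print('started')
  yield 1
  print('resumed after first yield')
  yield 2
  print('resumed after second yield')
  yield 3

gen = count_up_to_three()  # creates the generator; no code in the body runs yet
first = next(gen)          # runs until the first yield
second = next(gen)         # resumes and runs until the second yield
third = next(gen)

exhausted = False
try:
  next(gen)                # nothing left to yield
except StopIteration:
  exhausted = True

print(first, second, third, exhausted)
```

A `for` loop does the same thing behind the scenes and stops quietly when `StopIteration` is raised.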


## From Function to Generator

In [97]:
def squares(n):
  sq = []
  for i in range(n):
    print(f"{i} -> {i*i}")
    sq.append(i*i)
  return sq

for square in squares(10):
  print(square)

0 -> 0
1 -> 1
2 -> 4
3 -> 9
4 -> 16
5 -> 25
6 -> 36
7 -> 49
8 -> 64
9 -> 81
0
1
4
9
16
25
36
49
64
81


In [98]:
def squares(n):
  for i in range(n):
    print(f"{i} -> {i*i}")
    yield i*i

for square in squares(10):
  print(square)

0 -> 0
0
1 -> 1
1
2 -> 4
4
3 -> 9
9
4 -> 16
16
5 -> 25
25
6 -> 36
36
7 -> 49
49
8 -> 64
64
9 -> 81
81
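
The interleaved output shows that each square is computed only when the loop asks for it. One consequence is that a generator does not need an end at all. The sketch below, with the hypothetical name `all_squares`, takes the first few values of an endless generator with `itertools.islice`, and also shows a generator expression, the inline counterpart of a generator function.

```python
from itertools import islice

def all_squares():
  i = 0
  while True:      # never ends on its own; values are produced on demand
    yield i * i
    i += 1

first_five = list(islice(all_squares(), 5))
print(first_five)

# Generator expression: like a list comprehension, but without building the list.
total = sum(i * i for i in range(10))
print(total)
```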


Lambdas, generators, and comprehensions combined:

In [99]:
print("\n".join([" ".join(map(lambda i: str(i), range(1, i))) for i in [*range(2, 7), *range(5, 1, -1)]]))

1
1 2
1 2 3
1 2 3 4
1 2 3 4 5
1 2 3 4
1 2 3
1 2
1


# Exercises [Day 3]

![exercise](https://images.unsplash.com/photo-1574790398664-0cb03682ed1c?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=2342&q=80)

- [Functions](https://www.w3schools.com/python/exercise.asp?filename=exercise_functions1): exercises 1-6

- [Lambdas](https://www.w3schools.com/python/exercise.asp?filename=exercise_lambda1): exercise 1

# UP NEXT

[Data Analysis](https://colab.research.google.com/drive/1u2QhLYkF6lhc4BWmktD0JhkM5yzvmQuQ?usp=sharing)