# PROGRES - TME2

Fabien Mathieu - fabien.mathieu@normalesup.org

Sébastien Tixeuil - Sebastien.Tixeuil@lip6.fr

**Note**: 
- Star exercises (indicated by *) should only be attempted once all other exercises have been completed. You do not have to do them if you do not want to.

# Rules

1. Cite your sources
2. One file to rule them all
3. Explain
4. Execute your code


https://github.com/balouf/progres/blob/main/rules.ipynb

# Exercise 1 - Regular Expressions

Consider the following list:

In [2]:
L = ['marie.Dupond@gmail.com', 'lucie.Durand@wanadoo.fr',
'Sophie.Parmentier @@ gmail.com', 'franck.Dupres.gmail.com',
'pierre.Martin@lip6 .fr ',' eric.Deschamps@gmail.com ']

- Which of these entries are valid?
- Use regular expressions to identify valid *gmail* addresses and display them. 

Answer

The valid addresses are `'marie.Dupond@gmail.com'`, `'lucie.Durand@wanadoo.fr'`, and `' eric.Deschamps@gmail.com '`; of these, only the first and the last are gmail addresses. We consider otherwise valid strings that are whitespace-padded to also be valid: stripping is a simple operation, and accepting them gives a better user experience (for example, if the user does not realize there is an invisible space).

In [3]:
import re
import functools
from typing import List

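# Optional padding, then the address itself (captured), then optional padding.
# The dot in "gmail.com" is escaped and `$` rejects trailing garbage.
GMAIL_RE = re.compile(r'^\s*([0-9A-Za-z_.]+@gmail\.com)\s*$')

def _true_gmail_reducer(accumulator: List[str], test_address: str) -> List[str]:
    gmail_match = GMAIL_RE.match(test_address)
    if not gmail_match:
        return accumulator
    address = gmail_match.group(1)
    return accumulator + [address]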

def true_gmail(mail_list: List[str]) -> List[str]:
    return functools.reduce(_true_gmail_reducer, mail_list, [])

### Explanation

The `true_gmail` function transforms a list of strings into a list of the whitespace-stripped gmail addresses found in it. Because values of the output list may be transformed from those of the input list, a `reduce` is used in place of a `filter`.

The reducer implements the logic. It tests against a gmail regex and implements two cases:
1. If there is no match, throw out the address by returning the unchanged accumulator
2. Otherwise, continue to the next iteration with the desired portion of the address, by returning the accumulator with the address portion appended

Note that `+` is used for list extension rather than `.append`. This is to prevent any unexpected behavior that could come from mutation.
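For comparison, the same filter-and-transform step could also be written as a list comprehension, which likewise avoids mutating the input. This is only a sketch: `GMAIL_SKETCH_RE`, `true_gmail_comprehension`, and the sample list are illustrative names and data, and the pattern assumes that the dot in `gmail.com` should be literal and that nothing may follow the trailing padding.

```python
import re
from typing import List

# Hypothetical pattern for this sketch: padding, captured address, padding, end of string
GMAIL_SKETCH_RE = re.compile(r'^\s*([0-9A-Za-z_.]+@gmail\.com)\s*$')

def true_gmail_comprehension(mail_list: List[str]) -> List[str]:
    # Lazily match every entry, then keep only the captured group of the successful matches
    matches = (GMAIL_SKETCH_RE.match(entry) for entry in mail_list)
    return [m.group(1) for m in matches if m]

sample = ['marie.Dupond@gmail.com', ' eric.Deschamps@gmail.com ',
          'franck.Dupres.gmail.com', 'lucie.Durand@wanadoo.fr']
result = true_gmail_comprehension(sample)
print(result)
```

Whether this reads better than the `reduce` version is a matter of taste; both build a new list rather than modifying one in place.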

In [4]:
true_gmail(L)

['marie.Dupond@gmail.com', 'eric.Deschamps@gmail.com']

- Use regular expressions to check if a string ends with a number. 

Answer

In [5]:
def ends_with_number(txt: str) -> bool:
    return bool(re.match(r'^.*\d$', txt))

### Explanation

`ends_with_number` checks for a match of the given parameter against the regular expression `^.*\d$`. The regular expression could be worded in English as: "match anything from the beginning of the string, then match a digit followed by the string end".

`re.match` returns a `Match` object if a match is present, and `None` otherwise, but `ends_with_number` wants to return a boolean indicating yes or no. The result of `re.match` is transformed to the desired output by simply being passed to `bool`.
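One subtlety worth illustrating: in Python's `re`, `.` does not match a newline by default, and `$` also matches just before a trailing newline. A stricter variant, sketched here under the assumption that "ends with a number" means the very last character is a digit, could use `re.search` with `\Z`. The name `ends_with_digit_strict` is hypothetical.

```python
import re

def ends_with_digit_strict(txt: str) -> bool:
    # \Z only matches at the very end of the string, even when it ends with a newline
    return re.search(r'\d\Z', txt) is not None

checks = {
    'to42to666': ends_with_digit_strict('to42to666'),
    'to42to': ends_with_digit_strict('to42to'),
    'line1\nline2 7': ends_with_digit_strict('line1\nline2 7'),
    'abc1\n': ends_with_digit_strict('abc1\n'),
}
print(checks)
```

For single-line inputs without a trailing newline, both versions give the same answer.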

In [6]:
ends_with_number('to42to')

False

In [7]:
ends_with_number('to42to666')

True

- Use regular expressions to remove problematic zeros from an IPv4 address expressed as a 
string. (example: "216.08.094.196" should become "216.8.94.196", but "216.80.140.196" 
should remain "216.80.140.196"). 

Answer

In [8]:
IPV4_FIELD_RE = re.compile(r'0*(\d{1,3})')

def normalize_ip(txt):
    return '.'.join(IPV4_FIELD_RE.findall(txt))

### Explanation

`normalize_ip` uses a regular expression to match the desired substring for each sequence within an IPv4 address. The list of desired sequences is taken using `.findall`, which is then re-formatted to an IPv4 string using `'.'.join`. 

The regular expression used is `0*(\d{1,3})`. There are two parts to this expression:
1. `0*` matches 0 or more of the character `0`, at the beginning of the sequence, outside the capture group
2. `(\d{1,3})` matches 1-3 digits in a row for a sequence, and puts them in a capture group

The first part excludes leading `0`s from the capture group without requiring any leading `0`s for a match. Because the second part must match at least one digit, a `0` is still captured when it is the actual value of the sequence. For example, in the edge case `'000'`, only the last `0` ends up in the capture group.
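An alternative that edits the string in place, rather than rebuilding it from captured fields, could rely on `re.sub` with a lookahead. This is a sketch only: `LEADING_ZEROS_RE` and `normalize_ip_sub` are hypothetical names, and like the version above it does not validate that the input is a well-formed IPv4 address.

```python
import re

# Zeros at the start of a digit run, removed only when another digit follows
LEADING_ZEROS_RE = re.compile(r'\b0+(?=\d)')

def normalize_ip_sub(txt: str) -> str:
    # Everything that is not a problematic zero is left untouched
    return LEADING_ZEROS_RE.sub('', txt)

examples = ['216.08.094.196', '216.80.140.196', '000.00.0.000']
normalized = [normalize_ip_sub(ip) for ip in examples]
print(normalized)
```

The lookahead `(?=\d)` plays the same role as requiring at least one digit in the capture group: a field made only of zeros keeps its final `0`.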

In [9]:
normalize_ip("216.0.094.196")

'216.0.94.196'

In [10]:
normalize_ip("216.08.094.196")

'216.8.94.196'

In [11]:
normalize_ip("216.80.140.196")

'216.80.140.196'

In [12]:
normalize_ip("000.00.0.000")

'0.0.0.0'

- Use regular expressions to transform a date from MM-DD-YYYY format to DD-MM-YYYY 
format. (example "11-06-2020" should become "06-11-2020"). Optionally*, do the same thing using the `datetime` package.

Answer

In [13]:
DATE_RE = re.compile(r'^(\d{2})-(\d{2})-(\d{4})$')

def switch_md(txt: str) -> str:
    mm, dd, yyyy = DATE_RE.match(txt).groups()
    return '-'.join([dd, mm, yyyy])

### Explanation

`switch_md` uses a regex to match a full date string and grab groups of each section, then re-orders and re-joins them to the desired format.

Note that `switch_md` assumes the `txt` parameter matches this format. If it does not, `DATE_RE.match` returns `None` and the call to `.groups()` raises an `AttributeError`.

In [14]:
switch_md("11-06-2020")

'06-11-2020'
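For the optional `datetime` variant, a minimal sketch could parse the string with `strptime` and format it back with `strftime`. The function name `switch_md_datetime` is hypothetical. Unlike the regex version, this also rejects impossible dates.

```python
from datetime import datetime

def switch_md_datetime(txt: str) -> str:
    # strptime raises ValueError for malformed or impossible dates
    parsed = datetime.strptime(txt, '%m-%d-%Y')
    return parsed.strftime('%d-%m-%Y')

converted = switch_md_datetime('11-06-2020')
print(converted)
```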

# Exercise 2 - Analyze XML

- Write a Python code that retrieves the content of the page at:

In [15]:
url = "https://www.w3schools.com/xml/cd_catalog.xml"

In [16]:
from requests import Session
import xml.etree.ElementTree as ET

s = Session()
r = s.get(url)

### Explanation

To retrieve the URL content, `Session.get` is used, which gives the option to keep cookies and re-use a TCP connection if we were making multiple requests.

- Look at the text content and load as xml.

In [17]:
print(r.text)
cds = ET.fromstring(r.text)
print(f"Main tag: {cds.tag}; main attributes: {cds.attrib}")

<?xml version="1.0" encoding="UTF-8"?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
  <CD>
    <TITLE>Greatest Hits</TITLE>
    <ARTIST>Dolly Parton</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>RCA</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1982</YEAR>
  </CD>
  <CD>
    <TITLE>Still got the blues</TITLE>
    <ARTIST>Gary Moore</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>Virgin records</COMPANY>
    <PRICE>10.20</PRICE>
    <YEAR>1990</YEAR>
  </CD>
  <CD>
    <TITLE>Eros</TITLE>
    <ARTIST>Eros Ramazzotti</ARTIST>
    <COUNTRY>EU</COUNTRY>
    <COMPANY>BMG</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1997</YEAR>
  </CD>
  <CD>
    <TITLE>

### Explanation

To load the result as XML, `ElementTree.fromstring` is used, for simplicity's sake.

Answer

- Write a `display_cd` function that displays (i.e. `print`), for a CD: title, artist, country, company, year.
- Display all CDs.

Answer

In [18]:
def display_cd(cd: ET.Element) -> None:
    properties = [f'{child.tag}: {child.text}' for child in cd]
    print(', '.join(properties))

### Explanation

The chosen format for displaying a CD is to display all child tags and their text content, separated by commas. This is done by first creating a list of tags + values with the desired format, and then utilizing `.join` to easily intersperse commas, and printing the result.

- Display all 1980s CDs. 

In [19]:
for cd in cds:
  display_cd(cd)

TITLE: Empire Burlesque, ARTIST: Bob Dylan, COUNTRY: USA, COMPANY: Columbia, PRICE: 10.90, YEAR: 1985
TITLE: Hide your heart, ARTIST: Bonnie Tyler, COUNTRY: UK, COMPANY: CBS Records, PRICE: 9.90, YEAR: 1988
TITLE: Greatest Hits, ARTIST: Dolly Parton, COUNTRY: USA, COMPANY: RCA, PRICE: 9.90, YEAR: 1982
TITLE: Still got the blues, ARTIST: Gary Moore, COUNTRY: UK, COMPANY: Virgin records, PRICE: 10.20, YEAR: 1990
TITLE: Eros, ARTIST: Eros Ramazzotti, COUNTRY: EU, COMPANY: BMG, PRICE: 9.90, YEAR: 1997
TITLE: One night only, ARTIST: Bee Gees, COUNTRY: UK, COMPANY: Polydor, PRICE: 10.90, YEAR: 1998
TITLE: Sylvias Mother, ARTIST: Dr.Hook, COUNTRY: UK, COMPANY: CBS, PRICE: 8.10, YEAR: 1973
TITLE: Maggie May, ARTIST: Rod Stewart, COUNTRY: UK, COMPANY: Pickwick, PRICE: 8.50, YEAR: 1990
TITLE: Romanza, ARTIST: Andrea Bocelli, COUNTRY: EU, COMPANY: Polydor, PRICE: 10.80, YEAR: 1996
TITLE: When a man loves a woman, ARTIST: Percy Sledge, COUNTRY: USA, COMPANY: Atlantic, PRICE: 8.70, YEAR: 1987
TITLE

### Explanation

The root element has CDs as sub-elements. Since `display_cd` expects a single CD record, we iterate through the root and pass each child to `display_cd`.
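For the 1980s question, one possible approach is to compare the integer value of each `YEAR` child against the decade bounds. The sketch below is self-contained and runs on a small inline sample instead of the downloaded catalog; `SAMPLE` and `cds_from_decade` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-CD excerpt with the same tag names as the real catalog
SAMPLE = """<CATALOG>
  <CD><TITLE>Empire Burlesque</TITLE><YEAR>1985</YEAR></CD>
  <CD><TITLE>Still got the blues</TITLE><YEAR>1990</YEAR></CD>
  <CD><TITLE>Hide your heart</TITLE><YEAR>1988</YEAR></CD>
</CATALOG>"""

def cds_from_decade(catalog, start_year):
    """Return the CD elements whose YEAR lies in [start_year, start_year + 9]."""
    selected = []
    for cd in catalog.findall('CD'):
        year_text = cd.findtext('YEAR')
        # Skip CDs whose YEAR is missing or not a plain integer
        if year_text is not None and year_text.strip().isdigit():
            if start_year <= int(year_text) <= start_year + 9:
                selected.append(cd)
    return selected

catalog = ET.fromstring(SAMPLE)
eighties_titles = [cd.findtext('TITLE') for cd in cds_from_decade(catalog, 1980)]
print(eighties_titles)
```

With the real data, each selected element could be passed to `display_cd` instead of collecting titles.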

Answer

- Display all British CDs.

In [20]:
british_cds = cds.findall("CD[COUNTRY='UK']")
for bcd in british_cds:
  display_cd(bcd)


TITLE: Hide your heart, ARTIST: Bonnie Tyler, COUNTRY: UK, COMPANY: CBS Records, PRICE: 9.90, YEAR: 1988
TITLE: Still got the blues, ARTIST: Gary Moore, COUNTRY: UK, COMPANY: Virgin records, PRICE: 10.20, YEAR: 1990
TITLE: One night only, ARTIST: Bee Gees, COUNTRY: UK, COMPANY: Polydor, PRICE: 10.90, YEAR: 1998
TITLE: Sylvias Mother, ARTIST: Dr.Hook, COUNTRY: UK, COMPANY: CBS, PRICE: 8.10, YEAR: 1973
TITLE: Maggie May, ARTIST: Rod Stewart, COUNTRY: UK, COMPANY: Pickwick, PRICE: 8.50, YEAR: 1990
TITLE: For the good times, ARTIST: Kenny Rogers, COUNTRY: UK, COMPANY: Mucik Master, PRICE: 8.70, YEAR: 1995
TITLE: Tupelo Honey, ARTIST: Van Morrison, COUNTRY: UK, COMPANY: Polydor, PRICE: 8.20, YEAR: 1971
TITLE: The very best of, ARTIST: Cat Stevens, COUNTRY: UK, COMPANY: Island, PRICE: 8.90, YEAR: 1990
TITLE: Stop, ARTIST: Sam Brown, COUNTRY: UK, COMPANY: A and M, PRICE: 8.90, YEAR: 1988
TITLE: Bridge of Spies, ARTIST: T'Pau, COUNTRY: UK, COMPANY: Siren, PRICE: 7.90, YEAR: 1987
TITLE: Private

### Explanation

This code uses XPath to find all British CDs. It does this by selecting all `CD` tags which have a sub-tag `COUNTRY` with the text value `UK`.

Reference: [XPath section of the ElementTree docs](https://docs.python.org/3/library/xml.etree.elementtree.html#xpath-support)

Answer

# Exercise 3 - Analyze JSON

- Write a Python program that gets the file of filming locations in Paris at: 

In [21]:
url = "https://opendata.paris.fr/explore/dataset/lieux-de-tournage-a-paris/download/?format=json&timezone=Europe/Berlin&lang=fr"

- How many entries have you got?

In [37]:
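import json
from pathlib import Path
from requests import Session

def download(source_url, dest_file):
  s = Session()
  # Warning: this disables TLS certificate verification; drop it if the certificate validates
  s.verify = False
  r = s.get(source_url, stream=True)
  r.raise_for_status()  # fail early on HTTP errors instead of saving an error page
  dest_file = Path(dest_file)

  # Stream the body to disk in chunks to avoid holding it all in memory
  with open(dest_file, 'wb') as f:
    for chunk in r.iter_content(chunk_size=8192):
      if chunk:
        f.write(chunk)

FN = 'tournage.json'
download(url, FN)

# An explicit encoding avoids platform-dependent decoding of accented characters
with open(FN, encoding='utf-8') as f:
  locs = json.load(f)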

print('Entry count:', len(locs))



Entry count: 12265


### Explanation

This code makes use of the sample `download` function from the slides. The JSON file is downloaded to `tournage.json`, which is then re-opened to analyze. Since there is an array at the root, `len` is simply called on the loaded JSON to get the entry count.

Answer

- Analyze the JSON file: what is its structure?
- Write a function that converts an entry into a string that shows director, title, district, start date, end date, and geographic coordinates.
- Convert all entries into strings (warning: some entries may have issues).
- Display the first 20 entries.

Answer

In [60]:
def display_loc(entry):
    fields = entry['fields']
    director = fields.get('nom_realisateur', '<director missing>')
    title = fields.get('nom_tournage', '<title missing>')
    district = fields.get('ardt_lieu', '<district missing>')
    start_date = fields.get('date_debut', '<start date missing>')
    end_date = fields.get('date_fin', '<end date missing>')
    coord_x = fields.get('coord_x', '<x coordinate missing>')
    coord_y = fields.get('coord_y', '<y coordinate missing>')

    return f"{director}'s \"{title},\" filmed in {district} ({coord_x}, {coord_y}) from {start_date} to {end_date}"

### Explanation

Metadata for each entry is stored under the `'fields'` key; however, individual fields may be missing from an entry. To guard against this, `dict.get` is used to supply a default value when a key is missing.

### File structure

The JSON structure is an array of entries. The following is a formatted entry, to give an example of real data:

```json
{
   "datasetid":"lieux-de-tournage-a-paris",
   "recordid":"0ff321c5b140a12a8e50a1b212a7c5f5bced91d7",
   "fields":{
      "coord_x":2.37006242,
      "id_lieu":"2017-751",
      "adresse_lieu":"rue du faubourg du temple, 75011 paris",
      "geo_shape":{
         "coordinates":[
            2.370062415669748,
            48.8696979988026
         ],
         "type":"Point"
      },
      "coord_y":48.869698,
      "ardt_lieu":"75011",
      "nom_tournage":"2 Fils (Nouvelle Demande Décor Librairie / Journées interverties)",
      "nom_realisateur":"Félix MOATI",
      "date_debut":"2017-10-19",
      "type_tournage":"Long métrage",
      "annee_tournage":"2017",
      "nom_producteur":"NORD OUEST FILMS",
      "date_fin":"2017-10-19",
      "geo_point_2d":[
         48.8696979988026,
         2.370062415669748
      ]
   },
   "geometry":{
      "type":"Point",
      "coordinates":[
         2.370062415669748,
         48.8696979988026
      ]
   },
   "record_timestamp":"2024-01-31T13:40:46.402+01:00"
}
```

Each entry may be missing specific keys from `"fields"`. 

In [63]:
all_entries = [display_loc(e) for e in locs]
print('\n'.join(all_entries[:20]))

Félix MOATI's "2 Fils (Nouvelle Demande Décor Librairie / Journées interverties)," filmed in 75011 (2.37006242, 48.869698) from 2017-10-19 to 2017-10-19
Cathy Verney's "Vernon Subutex," filmed in 75001 (2.34248745, 48.85849331) from 2018-04-25 to 2018-04-26
Olivier Barma's "LEBOWITZ CONTRE LEBOWITZ 2," filmed in 75010 (2.36463505, 48.87597364) from 2017-06-01 to 2017-06-01
cheyenne carron's "À jamais fidèle," filmed in 75020 (2.39860034, 48.85154734) from 2017-08-24 to 2017-08-25
ZABOU BREITMAN's "CHRONIQUES PARISIENNES 16," filmed in 75013 (2.38127943, 48.82655665) from 2017-04-18 to 2017-04-18
Matthieu MARES-SAVELLI's "LOLYWOOD - DANS TES REVES LE SPORT," filmed in 75019 (2.39778751, 48.89300518) from 2017-04-13 to 2017-04-13
Hervé Mimran's "Un homme pressé," filmed in 75012 (2.36913602, 48.84258571) from 2017-05-23 to 2017-05-24
Cédric ANGER's "L'AMOUR EST UNE FÊTE," filmed in 75018 (2.33709787, 48.88267038) from 2017-06-14 to 2017-06-14
<director missing>'s "LEBOWITZ CONTRE LEBOWIT

- The same movie can have multiple shooting locations. Make a list of movies, where each entry contains the movie title, its director, and its shooting locations (district, start date, end date).
- How many movies do you have?
- Write a function that converts a movie into a string that shows director, title, and shootings.
- Convert all movies into strings.
- Display the first 20 entries.

Answer

In [70]:
from typing import Any, Dict, List

# A Movie is a plain dictionary (schema given in the explanation)
Movie = Dict[str, Any]

# Regroup locations per movie, keyed by title
movies_by_title: Dict[str, Movie] = dict()

for loc in locs:
  fields = loc['fields']
  title = fields['nom_tournage']
  if title not in movies_by_title:
    movies_by_title[title] = {
      'title': title,
      'director': fields.get('nom_realisateur', '<director missing>'),
      'shootings': []
    }
  movies_by_title[title]['shootings'].append({
    'district': fields.get('ardt_lieu', '<arrondissement missing>'),
    'start_date': fields.get('date_debut', '<start date missing>'),
    'end_date': fields.get('date_fin', '<end date missing>')
  })

# Keep only the values: the title keys were needed for grouping only
movies: List[Movie] = list(movies_by_title.values())

### Explanation

The question asks for two tasks to be accomplished:
1. Entries are grouped by which movie they are a part of
2. A subset of fields is displayed from each movie, including the newly aggregated field of shooting locations

The most straightforward way to create this aggregation is via a dictionary. The movie title is chosen as the key, as there are no better unique identifier fields referencing the movie itself. 

While this organization is being done, the opportunity is taken to normalize the data into a new structure containing exactly what we need, and with no fields missing:

A `Movie` is a dictionary with the schema:

```json
{
  "title": "string",
  "director": "string",
  "shootings": [
    {
      "district": "string",
      "start_date": "string",
      "end_date": "string"
    },
    ...
  ]
}
```

Since the top-level dictionary was only needed for the process of organization, and not for the final data representation, we re-organize all of its values into a list for the final `movies` variable.

In [71]:
len(movies)

1476

In [72]:
def display_movie(movie):
    movie_str = f"{movie['director']}'s \"{movie['title']},\" was filmed in the following locations:\n"
    for shooting in movie['shootings']:
        # Double quotes outside: reusing the same quote inside an f-string fails before Python 3.12
        movie_str += f"- {shooting['district']} between {shooting['start_date']} and {shooting['end_date']}\n"
    return movie_str

In [73]:
all_movie_displays = [display_movie(m) for m in movies]
print('\n'.join(all_movie_displays[:20]))

Félix MOATI's "2 Fils (Nouvelle Demande Décor Librairie / Journées interverties)," was filmed in the following locations:
- 75011 between 2017-10-19 and 2017-10-19
- 75011 between 2017-10-19 and 2017-10-19

Cathy Verney's "Vernon Subutex," was filmed in the following locations:
- 75001 between 2018-04-25 and 2018-04-26
- 75019 between 2018-05-22 and 2018-05-22
- 75019 between 2018-05-25 and 2018-05-25
- 75010 between 2018-05-03 and 2018-05-06
- 75011 between 2018-03-19 and 2018-03-19
- 75011 between 2018-06-01 and 2018-06-02
- 75014 between 2018-06-05 and 2018-06-14
- 75014 between 2018-06-13 and 2018-06-13
- 75019 between 2018-05-22 and 2018-05-22
- 75009 between 2018-04-11 and 2018-04-11
- 75007 between 2018-06-13 and 2018-06-15
- 75012 between 2018-03-23 and 2018-03-23
- 75011 between 2018-04-04 and 2018-04-04
- 75016 between 2018-03-30 and 2018-03-30
- 75004 between 2018-03-20 and 2018-03-20
- 75004 between 2018-04-26 and 2018-04-27
- 75012 between 2018-08-28 and 2018-08-30
- 75011

- Display for each district its number of shootings. 

Answer

In [37]:
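import functools
from typing import Dict

def district_count_reducer(acc: Dict[str, int], entry: dict) -> Dict[str, int]:
    # Count one shooting for the entry's district; '???' marks a missing district
    district = entry['fields'].get('ardt_lieu', '???')
    return {**acc, district: acc.get(district, 0) + 1}

counts = functools.reduce(district_count_reducer, locs, {})
# Sort districts by descending number of shootings
stats = dict(sorted(counts.items(), key=lambda kv: kv[1], reverse=True))
stats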

{'75018': 1043,
 '75008': 798,
 '75010': 749,
 '75019': 745,
 '75001': 722,
 '75004': 670,
 '75013': 658,
 '75007': 657,
 '75009': 642,
 '75011': 641,
 '75005': 640,
 '75016': 614,
 '75012': 596,
 '75020': 587,
 '75006': 471,
 '75116': 421,
 '75017': 378,
 '75015': 363,
 '75014': 321,
 '75002': 297,
 '75003': 236,
 '93500': 6,
 '94320': 4,
 '???': 1,
 '93320': 1,
 '92220': 1,
 '92170': 1,
 '93200': 1,
 '93000': 1}
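A per-district tally like the one above can also be obtained with `collections.Counter`. The following is a minimal sketch on a hypothetical miniature dataset (`sample_locs`) that mimics the shape of the real entries; the `'???'` placeholder for a missing district is taken from the output above.

```python
from collections import Counter

# Hypothetical miniature dataset with the same shape as the real entries
sample_locs = [
    {'fields': {'ardt_lieu': '75018'}},
    {'fields': {'ardt_lieu': '75008'}},
    {'fields': {'ardt_lieu': '75018'}},
    {'fields': {}},  # district missing
]

district_counts = Counter(
    entry['fields'].get('ardt_lieu', '???') for entry in sample_locs
)
# most_common() yields (district, count) pairs in descending order of count
ranked = district_counts.most_common()
print(ranked)
```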

# Exercise 4 - Analyze CSV

- Write Python code that retrieves the file of the most loaned titles in libraries in Paris at: 

In [38]:
url = "https://opendata.paris.fr/explore/dataset/les-titres-les-plus-pretes/download/?format=csv&timezone=Europe/Berlin&lang=en&use_labels_for_header=true&csv_separator=%3B"

Answer

- Analyze the resulting CSV file to display, for all entries: title, author, and total number of loans.

Answer
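As a rough sketch of how such a file could be read, `csv.DictReader` with `delimiter=';'` matches the separator requested in the URL. The column names `title`, `author`, and `loans`, the helper names, and the inline sample are all assumptions made for illustration; the real header names have to be read from the downloaded file.

```python
import csv
import io

# Hypothetical excerpt; the real header names must be taken from the downloaded file
SAMPLE_CSV = (
    "title;author;loans\n"
    "Razzia;Sobral, Patrick;2938\n"
    "Uno :;;3136\n"
)

def load_books(text: str) -> list:
    # The URL requests ';' as the separator, hence the explicit delimiter
    return list(csv.DictReader(io.StringIO(text), delimiter=';'))

def book_to_string(book: dict) -> str:
    # Omit the "by ..." part when the author column is empty
    author = f', by {book["author"]}' if book['author'] else ''
    return f'"{book["title"]}"{author} ({book["loans"]} loans)'

sample_books = load_books(SAMPLE_CSV)
lines = [book_to_string(b) for b in sample_books]
print('\n'.join(lines))
```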

In [40]:
def disp_book(book):
    ...

In [42]:
print('\n'.join( [disp_book(b) for b in books[:20]]))

"Razzia", by Sobral,  Patrick (2938 loans)
"Touche pas à mon veau", by Guibert,  Emmanuel (2296 loans)
"Max et Lili vont chez papy et mamie", by Saint-Mars,  Dominique de (5554 loans)
"Lili veut un petit chat", by Saint-Mars,  Dominique de (5789 loans)
"Max et Lili font du camping", by Saint-Mars,  Dominique de (5658 loans)
"Lili trouve sa maîtresse méchante", by Saint-Mars,  Dominique de (4694 loans)
"J'irai où tu iras", by Lyfoung,  Patricia (4707 loans)
"Les nerfs à vif", by Nob (2837 loans)
"Je crois que je t'aime", by Lyfoung,  Patricia (3878 loans)
"Attention tornade", by Cazenove,  Christophe (2366 loans)
"Max et Lili se posent des questions sur Dieu", by Saint-Mars,  Dominique de (4823 loans)
"Game over. 13. Toxic affair", by Midam (2652 loans)
"Les Schtroumpfs et la tempête blanche", by Jost,  Alain (975 loans)
"On a marché sur la lune", by Hergé (5674 loans)
"Astérix chez les Bretons", by Goscinny,  René (3014 loans)
"Parvati", by Ogaki,  Philippe (2616 loans)
"Les Schtroumpf

- Display for each type of document (there can be several entries for the same type of document), the total number of loans for this type. 

Answer

In [44]:
stats

{'Bande dessinée jeunesse': 2300143,
 'Livre jeunesse': 104067,
 'Bande dessinée adulte': 59726,
 'Livre adulte': 41731,
 'Bande dessinée ado': 29819,
 'Livre sonore jeunesse': 10630,
 'Jeux de société prêtable': 10057,
 'Musique jeunesse': 4792,
 'Jeux vidéos tous publics Non prêtables': 4235,
 'DVD jeunesse': 2471,
 'Jeux de société': 1753}
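Totals per document type can be accumulated in a dictionary keyed by type. The sketch below assumes hypothetical column names `doc_type` and `loans` and uses made-up rows; only the descending sort mirrors the output above.

```python
from collections import defaultdict

# Hypothetical rows; 'doc_type' and 'loans' stand in for the real column names
sample_rows = [
    {'doc_type': 'Livre jeunesse', 'loans': '120'},
    {'doc_type': 'DVD jeunesse', 'loans': '30'},
    {'doc_type': 'Livre jeunesse', 'loans': '80'},
]

totals = defaultdict(int)
for row in sample_rows:
    # CSV values arrive as strings, so convert before summing
    totals[row['doc_type']] += int(row['loans'])

# Sort by descending total number of loans
loans_by_type = dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))
print(loans_by_type)
```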

- Display titles in order of profitability (in descending order of the number of loans per copy).

In [46]:
print('\n'.join( [disp_book(b) for b in sorted_books[:20]]))

"Console Nintendo Switch" (1648 loans, 2 copies)
"Console PlayStation 4" (2587 loans, 6 copies)
"SOS ouistiti :" (1868 loans, 5 copies)
"Quatre en ligne :" (1753 loans, 5 copies)
"Perplexus : : original" (2254 loans, 8 copies)
"Un enfant chez les schtroumpfs", by Díaz Vizoso,  Miguel (4504 loans, 43 copies)
"Mon meilleur ami", by Verron,  Laurent (4662 loans, 47 copies)
"Les vacances infernales", by Cohen,  Jacqueline (5014 loans, 51 copies)
"Bande de sauvages !", by Cohen,  Jacqueline (5761 loans, 60 copies)
"Trop, c'est trop !", by Cohen,  Jacqueline (4504 loans, 47 copies)
"Les fous du mercredi", by Cohen,  Jacqueline (5169 loans, 54 copies)
"Ca va chauffer !", by Cohen,  Jacqueline (4071 loans, 44 copies)
"Uno :" (3136 loans, 34 copies)
"Ca roule !", by Cohen,  Jacqueline (5763 loans, 63 copies)
"Salut, les zinzins !", by Cohen,  Jacqueline (4565 loans, 50 copies)
"Les deux terreurs", by Cohen,  Jacqueline (3999 loans, 44 copies)
"Subliiiimes !", by Cohen,  Jacqueline (5007 loans, 

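Ordering by loans per copy amounts to sorting with a ratio as the key. In this sketch the field names `loans` and `copies` are assumptions, the third row is invented to show the zero-copy guard, and treating a zero copy count as a ratio of `0.0` is a design choice rather than something the exercise specifies.

```python
# Two rows reuse figures from the output above; the third is invented
sample_titles = [
    {'title': 'Uno :', 'loans': 3136, 'copies': 34},
    {'title': 'Console Nintendo Switch', 'loans': 1648, 'copies': 2},
    {'title': 'No copies recorded', 'loans': 10, 'copies': 0},
]

def loans_per_copy(book: dict) -> float:
    # Guard against a zero copy count to avoid ZeroDivisionError
    return book['loans'] / book['copies'] if book['copies'] else 0.0

ranked_titles = sorted(sample_titles, key=loans_per_copy, reverse=True)
print([b['title'] for b in ranked_titles])
```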
# Exercise 5 * - Analyze HTML

- Write a Python program that gets the content of the Wikipedia page at: 

In [47]:
url = "https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population_density"

Answer

- Display all the countries mentioned in the table. 

Answer
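One way to pull names out of an HTML table using only the standard library is to subclass `html.parser.HTMLParser` and collect the text of the first `<td>` of each row. This is a sketch under strong assumptions: the inline `SAMPLE_HTML` is a made-up stand-in (including its density figures), the class name `FirstColumnParser` is hypothetical, and the real Wikipedia table may put the country in a different column or wrap it in extra markup.

```python
from html.parser import HTMLParser

class FirstColumnParser(HTMLParser):
    """Collect the text of the first <td> of every table row."""

    def __init__(self):
        super().__init__()
        self.cell_index = -1      # index of the current <td> within its row
        self.in_first_cell = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self.cell_index = -1
        elif tag == 'td':
            self.cell_index += 1
            self.in_first_cell = self.cell_index == 0
            if self.in_first_cell:
                self.names.append('')

    def handle_endtag(self, tag):
        if tag == 'td':
            self.in_first_cell = False

    def handle_data(self, data):
        # Text may arrive in several pieces, so append to the current name
        if self.in_first_cell:
            self.names[-1] += data

# Made-up table for illustration; not copied from the real page
SAMPLE_HTML = """
<table>
  <tr><th>Country</th><th>Density</th></tr>
  <tr><td><a href="/wiki/Monaco">Monaco</a></td><td>18,000</td></tr>
  <tr><td><a href="/wiki/Singapore">Singapore</a></td><td>8,000</td></tr>
</table>
"""

parser = FirstColumnParser()
parser.feed(SAMPLE_HTML)
country_names = [name.strip() for name in parser.names]
print(country_names)
```

A dedicated HTML library would likely be more robust on the real page; the point here is only the row-and-cell bookkeeping.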

In [50]:
countries

['Monaco',
 'Singapore',
 'Bahrain',
 'Maldives',
 'Malta',
 'Vatican City',
 'Bangladesh',
 'Taiwan',
 'Mauritius',
 'Barbados',
 'Nauru',
 'San Marino',
 'Rwanda',
 'South Korea',
 'Lebanon',
 'Burundi',
 'Tuvalu',
 'India',
 'Netherlands',
 'Haiti',
 'Israel',
 'Philippines',
 'Belgium',
 'Comoros',
 'Grenada',
 'Sri Lanka',
 'Japan',
 'El Salvador',
 'Pakistan',
 'Trinidad and Tobago',
 'Vietnam',
 'Saint Lucia',
 'United Kingdom',
 'Saint Vincent and the Grenadines',
 'Jamaica',
 'Luxembourg',
 'Liechtenstein',
 'Gambia',
 'Nigeria',
 'Kuwait',
 'São Tomé and Príncipe',
 'Seychelles',
 'Qatar',
 'Germany',
 'Dominican Republic',
 'Marshall Islands',
 'Malawi',
 'North Korea',
 'Antigua and Barbuda',
 'Switzerland',
 'Nepal',
 'Uganda',
 'Italy',
 'Kiribati',
 'Saint Kitts and Nevis',
 'Andorra',
 'Guatemala',
 'Micronesia',
 'Togo',
 'Kosovo',
 'China',
 'Cape Verde',
 'Isle of Man',
 'Indonesia',
 'Tonga',
 'Ghana',
 'Thailand',
 'Denmark',
 'Cyprus',
 'United Arab Emirates',
 'T

- Display for each country its rank, density, population, area. 

Answer

- Save the information obtained in a Python dictionary. 

Answer

- Using the previously saved Python dictionary, ask the user for a country and display the corresponding information.

Answer
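Assuming the table has been saved as a dictionary keyed by country name, the lookup could be sketched as follows. The dictionary layout, the function name `describe_country`, and the figures are placeholders for illustration and are not taken from the page.

```python
# Placeholder figures for illustration only, not taken from the page
country_info = {
    'Monaco': {'rank': 1, 'density': 100.0, 'population': 1000, 'area': 10},
}

def describe_country(info: dict, name: str) -> str:
    # Case-insensitive lookup so that "monaco" and "Monaco" both work
    by_lower = {key.lower(): key for key in info}
    key = by_lower.get(name.strip().lower())
    if key is None:
        return f'No data for "{name.strip()}"'
    row = info[key]
    return (f'{key}: rank {row["rank"]}, density {row["density"]}, '
            f'population {row["population"]}, area {row["area"]}')

answer = describe_country(country_info, ' monaco ')
print(answer)
# In the notebook, the name would come from: input('Country? ')
```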

# Exercise 6 * - Web API

- Write a Python program that will make available a Web API allowing elementary calculations on 
integers.

The APIs are accessible by GET and in the form: 
- /add/{integer1}/{integer2}: add integer1 and integer2
- /sub/{integer1}/{integer2}: perform the subtraction of integer1 and integer2
- /mul/{integer1}/{integer2}: carry out the multiplication of integer1 and integer2
- /div/{integer1}/{integer2}: perform the integer division of integer1 by integer2
- /mod/{integer1}/{integer2}: perform the remainder of the integer division of integer1
by integer2

Answer
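Independently of the web framework, the five endpoints share one piece of logic: map the operation name to an integer operation and apply it to the two path parameters. The sketch below shows that logic with the standard library only; `OPERATIONS` and `compute` are hypothetical names, and raising `ValueError` on bad input is an assumption about the desired error handling. A route handler in a framework such as Flask could delegate to a function like this.

```python
import operator

# Maps each endpoint name to the integer operation it performs
OPERATIONS = {
    'add': operator.add,
    'sub': operator.sub,
    'mul': operator.mul,
    'div': operator.floordiv,
    'mod': operator.mod,
}

def compute(path: str) -> int:
    """Evaluate a path such as '/mul/6/7'; raise ValueError if it is malformed."""
    parts = path.strip('/').split('/')
    if len(parts) != 3 or parts[0] not in OPERATIONS:
        raise ValueError(f'unsupported path: {path}')
    # int() itself raises ValueError for non-integer operands
    op, a, b = parts[0], int(parts[1]), int(parts[2])
    if op in ('div', 'mod') and b == 0:
        raise ValueError('division by zero')
    return OPERATIONS[op](a, b)

results = [compute(p) for p in ('/mul/6/7', '/div/42/8', '/mod/42/8')]
print(results)
```

In a web handler, the `ValueError` would typically be translated into a 4xx response rather than left to propagate.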

In [52]:
app.run(host='localhost', port=8080)

 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on http://localhost:8080
Press CTRL+C to quit
127.0.0.1 - - [11/Oct/2024 09:03:41] "GET /mod/42/8 HTTP/1.1" 200 -


http://localhost:8080/mul/6/7

http://localhost:8080/div/42/8

http://localhost:8080/mod/42/8

- Write a Python program that will test the web API made available through the requests
library. 

Answer