Programming Fundamentals
============================

This notebook retrieves data from the web using three methods: calling a web API manually, using a dedicated API client library, and web scraping.

### Exercise 1

Using the [requests](http://docs.python-requests.org/) library, a request to an API can be made manually. For example:

`response = requests.get('http://api.postcodes.io/postcodes/E98%201TT')`
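The `%20` in the URL above is a percent-encoded space: the postcode being looked up is `E98 1TT`. As a minimal sketch, assuming the postcode is available as a plain string, the standard library's `urllib.parse` can produce and reverse that encoding. The variable names are illustrative only and not part of the exercise.

```python
from urllib.parse import quote, unquote

# Postcode as a person would write it, with a space
postcode = "E98 1TT"

# quote() replaces characters that are unsafe in a URL path (the space becomes %20)
encoded = quote(postcode)
url = "http://api.postcodes.io/postcodes/" + encoded

print(url)
print(unquote(encoded))  # decoding gives back the original postcode
```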

When the request completes, it returns an object that contains, among other things, the attributes **status_code**, **content** and **headers**. Look up the different values of **status_code** and complete the table of HTTP status codes:

Main HTTP status codes:
- 200: OK. The request succeeded.
- 301: Moved Permanently. The resource has been moved permanently to a new URL.
- 400: Bad Request. The server cannot or will not process the request due to an apparent client error (e.g., malformed request syntax).
- 401: Unauthorized. Authentication is required and has failed or has not yet been provided.
- 403: Forbidden. The request contained valid data and was understood by the server, but the server is refusing the action.
- 404: Not Found. The requested resource could not be found but may be available in the future.
- 500: Internal Server Error. A generic error message, given when an unexpected condition was encountered and no more specific message is suitable.
- 501: Not Implemented. The server either does not recognize the request method, or it lacks the ability to fulfil the request. Usually this implies future availability.
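The codes in this list can also be looked up programmatically. The sketch below uses `http.HTTPStatus` from the Python standard library, which is not part of the original exercise; `describe_status` is a hypothetical helper name.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a one-line summary for a numeric HTTP status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # HTTPStatus only knows the registered codes
        return "%d: unknown status code" % code
    return "%d %s: %s" % (status.value, status.phrase, status.description)


# Codes taken from the list above
for code in (200, 301, 400, 401, 403, 404, 500, 501):
    print(describe_status(code))
```

The descriptions printed by the standard library are shorter than the ones in the list, so they work as a quick reminder rather than a replacement.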

### Exercise 2

In this exercise we will request four different websites using the HTTP GET method, which `requests.get` implements.

Using `requests.get`, obtain the content and the `status_code` of the following websites:

- http://google.com
- http://wikipedia.org
- https://mikemai.net/
- http://google.com/noexisto

For each website show:
- First 80 characters of the content
- The `status_code`

In [1]:
# Import library
import requests

urls = ["http://google.com",
        "http://wikipedia.org",
        "https://mikemai.net",
        "http://google.com/noexisto"]

for u in urls:
    r = requests.get(u)
    print("First characters of: %s" % u)
    print("-"*60)
    try:
        # Show the text that follows the first content=" attribute, if there is one
        print(r.text.split("content=\"")[1][:60])
    except IndexError:
        print("error")
    print("-"*60)
    print("Status code: %d \n" % r.status_code)

First characters of: http://google.com
------------------------------------------------------------
Google.es permite acceder a la información mundial en castel
------------------------------------------------------------
Status code: 200 

First characters of: http://wikipedia.org
------------------------------------------------------------
Wikipedia is a free online encyclopedia, created and edited 
------------------------------------------------------------
Status code: 200 

First characters of: https://mikemai.net
------------------------------------------------------------
error
------------------------------------------------------------
Status code: 406 

First characters of: http://google.com/noexisto
------------------------------------------------------------
initial-scale=1, minimum-scale=1, width=device-width">
  <ti
------------------------------------------------------------
Status code: 404 
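
The statement asks for the first 80 characters of the content, while the cell above prints 60 characters taken after the first `content="` attribute. A minimal sketch of the literal reading is shown below. `report` is a hypothetical helper and the sample body is made up so that the code runs without network access; inside the loop above it would be called as `report(u, r.status_code, r.text)`.

```python
def report(url, status_code, text, length=80):
    """Format the URL, the status code and the first `length` characters of a page."""
    preview = text[:length]
    return "%s\nStatus code: %d\n%s" % (url, status_code, preview)


# Made-up body standing in for r.text
sample_body = "<!doctype html><html><head><title>Example</title></head><body>" + "x" * 100
print(report("http://example.org", 200, sample_body))
```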



### Exercise 3
In this exercise we will have fun with cats. *Cat-facts* exposes an API at https://cat-fact.herokuapp.com with two endpoints:

- **/facts**
- **/users**

Each fact is structured in the following way:


|    Key    |      Type     |                                              Description                                              |
|:---------:|:-------------:|:-----------------------------------------------------------------------------------------------------:|
| _id       | ObjectId      | Unique ID for the Fact                                                                                |
| __v       | Number        | Version number of the Fact                                                                            |
| user      | ObjectId      | ID of the User who added the Fact                                                                     |
| text      | String        | The Fact itself                                                                                       |
| updatedAt | Timestamp     | Date in which Fact was last modified                                                                  |
| sendDate  | Timestamp     | If the Fact is meant for one time use, this is the date that it is used                               |
| deleted   | Boolean       | Whether or not the Fact has been deleted (Soft deletes are used)                                      |
| source    | String (enum) | Can be 'user' or 'api', indicates who added the fact to the DB                                        |
| used      | Boolean       | Whether or not the Fact has been sent by the CatBot. This value is reset each time every Fact is used |
| type      | String        | Type of animal the Fact describes (e.g. ‘cat’, ‘dog’, ‘horse’)                                        |

To obtain the **fact** with id *58e0086f0aac31001185ed02*, we must build the request:

- *https://cat-fact.herokuapp.com/facts/58e0086f0aac31001185ed02*

The resulting object will contain the fields from the previous table in JSON format.

a) Make a request, transform the result into a dictionary and show the fields of the previous table for the fact with id *58e0086f0aac31001185ed02*.


In [2]:
import requests
import json

u = "https://cat-fact.herokuapp.com"
accessPoint = "facts"
factnumber = "58e0086f0aac31001185ed02"

# Create url
f = requests.get('/'.join((u, accessPoint, factnumber)))

# Transform to dictionary
f_dict = json.loads(f.text)
f_dict

{'status': {'verified': True, 'sentCount': 1},
 'type': 'cat',
 'deleted': False,
 '_id': '58e0086f0aac31001185ed02',
 'user': {'name': {'first': 'Kasimir', 'last': 'Schulz'},
  'photo': 'https://lh6.googleusercontent.com/-BS_rskGd3kA/AAAAAAAAAAI/AAAAAAAAADg/yAxrX9QabMg/photo.jpg?sz=200',
  '_id': '58e007480aac31001185ecef'},
 'text': "Cats can't taste sweetness.",
 '__v': 0,
 'source': 'https://www.scientificamerican.com/article/strange-but-true-cats-cannot-taste-sweets/',
 'updatedAt': '2020-08-29T20:20:03.172Z',
 'createdAt': '2018-03-16T20:20:03.622Z',
 'used': True}
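
In this response the `user` field is itself a dictionary, so the user's id sits one level down. The sketch below shows the nested lookup on a trimmed copy of the response above, written as raw JSON text so that it runs without network access. The variable names are illustrative only.

```python
import json

# Trimmed copy of the response above, as the raw JSON text the API returns
raw = """
{"type": "cat",
 "_id": "58e0086f0aac31001185ed02",
 "user": {"name": {"first": "Kasimir", "last": "Schulz"},
          "_id": "58e007480aac31001185ecef"},
 "text": "Cats can't taste sweetness.",
 "used": true}
"""

fact = json.loads(raw)          # JSON true/false become Python True/False
user_id = fact["user"]["_id"]   # nested lookup: a dictionary inside a dictionary
full_name = "%s %s" % (fact["user"]["name"]["first"], fact["user"]["name"]["last"])

print("Type: %s\tUser: %s" % (fact["type"], user_id))
print("Added by:", full_name)
```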

b) For the following *fact ids*:

- *5d38bdab0f1c57001592f156*
- *5ed11e643c15f700172e3856*
- *5ef556dff61f300017030d4c*
- *5d9d4ae168a764001553b388*


Obtain the fields *type*, *user*, *used*, *_id*, *source* and *text*, and print them in this format:

`Type: cat	User: 58e007480aac31001185ecef
Used: True	Id: 58e0086f0aac31001185ed02
Source: https://www.scientificamerican.com/article/strange-but-true-cats-cannot-taste-sweets/
Text: Cats can't taste sweetness.`


In [3]:
import requests
import json

u = "https://cat-fact.herokuapp.com"
facts = "facts"


factnumbers = ["5d38bdab0f1c57001592f156",
               "5ed11e643c15f700172e3856",
               "5ef556dff61f300017030d4c",
               "5d9d4ae168a764001553b388"]

for fn in factnumbers:
    f = requests.get('/'.join((u, facts, fn)))
    f_dict = json.loads(f.text)
    strformat = "Type: %s\tUser: %s\nUsed: %s\tId: %s\nSource: %s\nText: %s\n\n"
    # 'user' is a nested dictionary; the requested format shows only its '_id'
    print(strformat % (f_dict['type'],
                       f_dict['user']['_id'],
                       f_dict['used'],
                       f_dict['_id'],
                       f_dict['source'],
                       f_dict['text']))

Type: cat	User: 5a9ac18c7478810ea6c06381
Used: False	Id: 5d38bdab0f1c57001592f156
Source: user
Text: While some cats love being brushed, others don't take to it naturally. Try to groom your cat in the same spot at the same time of day to create a sense of routine.


Type: cat	User: 5ed11e353c15f700172e3855
Used: False	Id: 5ed11e643c15f700172e3856
Source: user
Text: Los gatos tienen más huesos que los seres humanos, nos ganan por 24.


Type: cat	User: 5e1a9b981fd6150015fa736f
Used: False	Id: 5ef556dff61f300017030d4c
Source: user
Text: Lucy, the oldest cat ever

### Exercise 4

In the previous exercises, we used the requests library to download the content of an API manually. However, there are dedicated libraries that make it easier to access certain APIs, although they require authentication.

Use the documentation to obtain the 4 authentication codes required to access Twitter's API through the **tweepy** library. Using this library, write two functions:

- The first one will authenticate the user using the 4 codes obtained during registration.
- The second one will receive a `tweepy.models.User` object and will print:
    1. Number of tweets of the user
    2. Number of friends
    3. Number of followers
    4. The `screen_name` and `name` of the first 10 friends with their descriptions.
    
Execute both functions using the **Space_Station** twitter user. 

In [4]:
import tweepy


def InitUser(username):

    consumer_key = 'cgjQlNPlYdY6xCx54p9aIqw2p'
    consumer_secret = 'okrnv66Q96Hk7S4vvtU72OAgG6ilDbTUgFnARZx9dxfHqDayNV'
    access_token = '1292698694794477569-KjCUFsidhoZM9mKp2PVZHLGm6bNykY'
    access_secret = 'c9nYZOVqnon5FV1LlITbIt0lK4aML3tT3UvKXHyNgOgBf'

    #  Initialize interaction with the API
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_secret)
    api = tweepy.API(auth)

    # Obtain user info
    user = api.get_user(username)

    # Return a tuple
    return (user, api)
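
Writing the four codes directly in a cell exposes them to anyone who can read the notebook. One common alternative, sketched below under assumptions, is to read them from environment variables. The variable names in `CREDENTIAL_VARS` and the helper `load_credentials` are hypothetical; they are not defined by tweepy or by Twitter.

```python
import os

# Hypothetical variable names; any naming scheme works as long as it is used consistently
CREDENTIAL_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(env=os.environ):
    """Return the four codes as a tuple, or raise if any of them is missing."""
    missing = [name for name in CREDENTIAL_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in CREDENTIAL_VARS)


# Possible use inside InitUser, once the variables are set in the environment:
# consumer_key, consumer_secret, access_token, access_secret = load_credentials()
```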

In [5]:
def InfoUser(user):
    
    print("Number of followers: {}".format(user.followers_count))
    print("Number of friends: {}".format(user.friends_count))
    print("Number of tweets: {}\n".format(user.statuses_count))
    
    for i, f in enumerate(user.friends()[:10]):
        print("N: %d\tName: %s\tScreenName: %s\tDescription:%s\n" %
              (i+1, f.name, f.screen_name, f.description))

In [6]:
(user, api) = InitUser("Space_Station")
InfoUser(user)

Number of followers: 4340700
Number of friends: 218
Number of tweets: 13918

N: 1	Name: Zebulon Scoville	ScreenName: Explorer_Flight	Description:86th NASA Flight Director. Lucky husband and father. Always looking for a challenge. Tweets are my own, so don't blame NASA.

N: 2	Name: Stephanie Wilson	ScreenName: Astro_Stephanie	Description:

N: 3	Name: Jim Morhard	ScreenName: jmorhard	Description:

N: 4	Name: Bob Cabana	ScreenName: Astro_CabanaBob	Description:Former astronaut and current Center Director of Kennedy Space Center.

N: 5	Name: Sergey Kud-Sverchkov	ScreenName: KudSverchkov	Description:Космонавт Роскосмоса (@Roscosmos) Сергей Кудь-Сверчков
//
@Roscosmos cosmonaut Sergey Kud-Sverchkov

N: 6	Name: U.S. Space Command	ScreenName: US_SpaceCom	Description:The OFFICIAL Twitter Page of United States Space Command, the 11th Combatant Command in the Department of Defense. #USSPACECOM

N: 7	Name: Joshua Kutryk	ScreenName: Astro_Kutryk	Description:Canadian Space Agency Astronaut and RCAF T

### Exercise 5

[Congreso.es](http://www.congreso.es/) is the website of Spain's Congress of Deputies. This [page](https://www.congreso.es/web/guest/hemiciclo) shows a representation of the hemicycle together with each deputy's seat, photograph, constituency and political party.

Use `scrapy` to extract (using an `xpath`) the following information as a dictionary. For example:

`{'Nombre': 'Carrerons Cano, Juan Antonio', 'Territorio': 'Diputat per Ciudad Real', 'Partido': 'G.P. Popular al Congrés', 'url': '/wc/htdocs/web/img/diputados/peq/35_14.jpg'}`


In [7]:
import scrapy
from scrapy.crawler import CrawlerProcess

# Create spider
class deputies_spider(scrapy.Spider):

    # Spider's name
    name = "congreso_spider"

    # Start URL: the hemicycle page
    start_urls = ["https://www.congreso.es/web/guest/hemiciclo"]
    
    def parse(self, response):
        # Extract the information
        xpath = '//area/@onmouseover'
        for dip in response.xpath(xpath):
            _, iurl, _, _, _, Nom, _, Territori, _, Partit, _, _, _, _, _ = dip.extract().split("'")
            yield {
                'Nombre': Nom.split('(')[0],
                'Territorio': Territori,
                'Partido': Partit,
                'url': iurl
            }
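
The `parse` method above combines two steps: selecting the `onmouseover` attribute of every `area` element, and splitting its value on single quotes. The sketch below reproduces both steps offline with the standard library's `xml.etree.ElementTree`. ElementTree supports only a subset of XPath and cannot return attribute values directly, so the elements are selected first and the attribute is read with `.get()`. The markup and the `show(...)` handler text are made up for illustration and are not the real format used by congreso.es; `quoted_fields` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup that only mimics the shape of an image map
SAMPLE = """
<map name="hemiciclo">
  <area shape="rect" onmouseover="show('a.jpg', 'Name A')" />
  <area shape="rect" onmouseover="show('b.jpg', 'Name B')" />
  <area shape="rect" />
</map>
""".strip()

root = ET.fromstring(SAMPLE)

# Keep only the <area> elements that carry the attribute, then read its value
handlers = [area.get("onmouseover")
            for area in root.iterfind(".//area[@onmouseover]")]


def quoted_fields(handler):
    """Return the single-quoted fields of a handler string.

    Splitting on ' alternates between text outside and inside the quotes,
    so the quoted fields are the pieces at odd positions.
    """
    return handler.split("'")[1::2]


for handler in handlers:
    print(quoted_fields(handler))
```

Taking the odd-positioned pieces keeps working if the handler has more or fewer arguments, whereas unpacking a fixed number of values raises a `ValueError` as soon as the count changes.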

In [8]:
if __name__ == "__main__":

    # Create Crawler
    process = CrawlerProcess({
        'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)',
        'DOWNLOAD_HANDLERS': {'s3': None},
        'LOG_ENABLED': True
    })

    process.crawl(deputies_spider)

    # Start process
    process.start()

2021-02-08 09:20:49 [scrapy.utils.log] INFO: Scrapy 2.4.1 started (bot: scrapybot)
2021-02-08 09:20:49 [scrapy.utils.log] INFO: Versions: lxml 4.6.1.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 20.3.0, Python 3.8.5 (default, Sep  3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 19.1.0 (OpenSSL 1.1.1i  8 Dec 2020), cryptography 3.1.1, Platform Windows-10-10.0.18362-SP0
2021-02-08 09:20:49 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2021-02-08 09:20:49 [scrapy.crawler] INFO: Overridden settings:
{'USER_AGENT': 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'}
2021-02-08 09:20:49 [scrapy.extensions.telnet] INFO: Telnet Password: bf15028f27d574ab
2021-02-08 09:20:49 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']
2021-02-08 09:20:50 [scrapy.middleware] INFO: Enabled downloader middlewares:

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'González Laya, María Aránzazu ', 'Territorio': 'Diputado por ', 'Partido': 'G. P. ', 'url': '/docu/imgweb/diputados/503_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Grande-Marlaska Gómez, Fernando ', 'Territorio': 'Diputado por ', 'Partido': 'G. P. ', 'url': '/docu/imgweb/diputados/509_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Celaá Diéguez, Isabel ', 'Territorio': 'Diputado por ', 'Partido': 'G. P. ', 'url': '/docu/imgweb/diputados/510_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Maroto Illera, Reyes ', 'Territorio': 'Diputado por ', 'Partido': 'G. P. ', 'url': '/docu/imgweb/diputados/511_14.jpg'}
202

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Gutiérrez Prieto, Sergio ', 'Territorio': 'Diputado por Toledo', 'Partido': 'G. P. Socialista', 'url': '/docu/imgweb/diputados/180_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'López Álvarez, Patxi ', 'Territorio': 'Diputado por Bizkaia', 'Partido': 'G. P. Socialista', 'url': '/docu/imgweb/diputados/61_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Casares Hontañón, Pedro ', 'Territorio': 'Diputado por Cantabria', 'Partido': 'G. P. Socialista', 'url': '/docu/imgweb/diputados/347_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Diouf Dioh, Luc Andre ', 'Territorio': 'Diputado por Palmas (Las)', 'Partido': 'G. P. 

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Guijarro Ceballos, María ', 'Territorio': 'Diputada por Bizkaia', 'Partido': 'G. P. Socialista', 'url': '/docu/imgweb/diputados/136_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Sánchez Jódar, Marisol ', 'Territorio': 'Diputada por Murcia', 'Partido': 'G. P. Socialista', 'url': '/docu/imgweb/diputados/276_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Baños Ruiz, Carmen ', 'Territorio': 'Diputada por Murcia', 'Partido': 'G. P. Socialista', 'url': '/docu/imgweb/diputados/365_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'García Morís, Roberto ', 'Territorio': 'Diputado por Asturias', 'Partido': 'G. P. Socialist

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Bel Accensi, Ferran ', 'Territorio': 'Diputado por Tarragona', 'Partido': 'G. P. Plural', 'url': '/docu/imgweb/diputados/14_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Nogueras i Camero, Míriam ', 'Territorio': 'Diputada por Barcelona', 'Partido': 'G. P. Plural', 'url': '/docu/imgweb/diputados/27_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Borràs Castanyer, Laura ', 'Territorio': 'Diputada por Barcelona', 'Partido': 'G. P. Plural', 'url': '/docu/imgweb/diputados/25_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Ramírez Carner, Arnau ', 'Territorio': 'Diputado por Barcelona', 'Partido': 'G. P. Socialista',

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Rodríguez Rodríguez, Alberto ', 'Territorio': 'Diputado por S/C Tenerife', 'Partido': 'G. P. Confederal de Unidas Podemos-En Comú Podem-Galicia en Común', 'url': '/docu/imgweb/diputados/338_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Belarra Urteaga, Ione ', 'Territorio': 'Diputada por Navarra', 'Partido': 'G. P. Confederal de Unidas Podemos-En Comú Podem-Galicia en Común', 'url': '/docu/imgweb/diputados/139_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Vera Ruíz-Herrera, Noelia ', 'Territorio': 'Diputada por Cádiz', 'Partido': 'G. P. Confederal de Unidas Podemos-En Comú Podem-Galicia en Común', 'url': '/docu/imgweb/diputados/345_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scra

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Muñoz Dalda, Lucía ', 'Territorio': 'Diputada por Balears (Illes)', 'Partido': 'G. P. Confederal de Unidas Podemos-En Comú Podem-Galicia en Común', 'url': '/docu/imgweb/diputados/335_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Capdevila i Esteve, Joan ', 'Territorio': 'Diputado por Barcelona', 'Partido': 'G. P. Republicano', 'url': '/docu/imgweb/diputados/172_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Vallugera Balañà, Pilar ', 'Territorio': 'Diputada por Barcelona', 'Partido': 'G. P. Republicano', 'url': '/docu/imgweb/diputados/88_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Granollers Cunillera, Inés

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Trías Gil, Georgina ', 'Territorio': 'Diputada por Ávila', 'Partido': 'G. P. VOX', 'url': '/docu/imgweb/diputados/277_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Sánchez del Real, Víctor Manuel ', 'Territorio': 'Diputado por Badajoz', 'Partido': 'G. P. VOX', 'url': '/docu/imgweb/diputados/195_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Robles López, Joaquín ', 'Territorio': 'Diputado por Murcia', 'Partido': 'G. P. VOX', 'url': '/docu/imgweb/diputados/246_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Asarta Cuevas, Alberto ', 'Territorio': 'Diputado por Castellón/Castelló', 'Partido': 'G. P. VOX', 'url': 

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Gamarra Ruiz-Clavijo, Concepción ', 'Territorio': 'Diputada por Rioja (La)', 'Partido': 'G. P. Popular en el Congreso', 'url': '/docu/imgweb/diputados/261_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Casado Blanco, Pablo ', 'Territorio': 'Diputado por Madrid', 'Partido': 'G. P. Popular en el Congreso', 'url': '/docu/imgweb/diputados/307_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Ortega Smith-Molina, Francisco Javier ', 'Territorio': 'Diputado por Madrid', 'Partido': 'G. P. VOX', 'url': '/docu/imgweb/diputados/326_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Garriga Vaz de Concicao, Ignacio ', 'Territori

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Martín Llaguno, Marta ', 'Territorio': 'Diputada por Alicante/Alacant', 'Partido': 'G. P. Ciudadanos', 'url': '/docu/imgweb/diputados/155_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Díaz Gómez, Guillermo ', 'Territorio': 'Diputado por Málaga', 'Partido': 'G. P. Ciudadanos', 'url': '/docu/imgweb/diputados/122_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Méndez Monasterio, Lourdes ', 'Territorio': 'Diputada por Murcia', 'Partido': 'G. P. VOX', 'url': '/docu/imgweb/diputados/177_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Mariscal Zabala, Manuel ', 'Territorio': 'Diputado por Toledo', 'Partido': 'G. P. VOX

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Zurita Expósito, Ana María ', 'Territorio': 'Diputada por S/C Tenerife', 'Partido': 'G. P. Popular en el Congreso', 'url': '/docu/imgweb/diputados/111_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Aragonés Mendiguchía, Carlos ', 'Territorio': 'Diputado por Madrid', 'Partido': 'G. P. Popular en el Congreso', 'url': '/docu/imgweb/diputados/232_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Mestre Barea, Manuel ', 'Territorio': 'Diputado por Alicante/Alacant', 'Partido': 'G. P. VOX', 'url': '/docu/imgweb/diputados/268_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Nevado del Campo, María Magdalena ', 'Territorio'

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Navarro López, Pedro ', 'Territorio': 'Diputado por Zaragoza', 'Partido': 'G. P. Popular en el Congreso', 'url': '/docu/imgweb/diputados/220_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Velasco Morillo, Elvira ', 'Territorio': 'Diputada por Zamora', 'Partido': 'G. P. Popular en el Congreso', 'url': '/docu/imgweb/diputados/10_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Merino Martínez, Javier ', 'Territorio': 'Diputado por Rioja (La)', 'Partido': 'G. P. Popular en el Congreso', 'url': '/docu/imgweb/diputados/138_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Cruz-Guzmán García, María Soledad ', 'Territorio'

2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Ledesma Martín, Sebastián Jesús ', 'Territorio': 'Diputado por S/C Tenerife', 'Partido': 'G. P. Popular en el Congreso', 'url': '/docu/imgweb/diputados/110_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Castillo López, Elena ', 'Territorio': 'Diputada por Cantabria', 'Partido': 'G. P. Popular en el Congreso', 'url': '/docu/imgweb/diputados/77_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Bas Corugeira, Javier ', 'Territorio': 'Diputado por Pontevedra', 'Partido': 'G. P. Popular en el Congreso', 'url': '/docu/imgweb/diputados/118_14.jpg'}
2021-02-08 09:20:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.congreso.es/web/guest/hemiciclo>
{'Nombre': 'Jerez Juan, Miguel Ángel ', 'Territ

---

### Optional Exercise

Open Notify has an [API](http://api.open-notify.org) that provides information about the humans currently in space.

Using this API, write a function that prints the current number of astronauts in space, the number of crewed spacecraft in orbit and the names of the astronauts aboard each spacecraft.


In [9]:
import requests
import json


def SpaceCrafts():
    # Define url
    url = 'http://api.open-notify.org/astros.json'

    # Send the request
    response = requests.get(url)

    # Transform results to a dictionary
    content_dict = json.loads(response.content)
    print("Number of astronauts in orbit: %d" % content_dict["number"])
    naus = dict()

    for astro in content_dict['people']:
        if astro['craft'] in naus:
            naus[astro['craft']].append(astro['name'])
        else:
            naus[astro['craft']] = [astro['name']]
    print("Number of inhabited spacecrafts: %d" % len(naus.keys()))

    print("\nNumber of inhabitants of each craft:")
    for key, value in naus.items():
        print("   In craft: %s (%d)" % (key, len(value)))
        for a in value:
            print("\t%s" % a)

In [10]:
SpaceCrafts()

2021-02-08 09:20:51 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): api.open-notify.org:80
2021-02-08 09:20:52 [urllib3.connectionpool] DEBUG: http://api.open-notify.org:80 "GET /astros.json HTTP/1.1" 200 356


Number of astronauts in orbit: 7
Number of inhabited spacecrafts: 1

Number of inhabitants of each craft:
   In craft: ISS (7)
	Sergey Ryzhikov
	Kate Rubins
	Sergey Kud-Sverchkov
	Mike Hopkins
	Victor Glover
	Shannon Walker
	Soichi Noguchi
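
The grouping step inside `SpaceCrafts` can also be written with `collections.defaultdict`, which removes the explicit check for whether a craft has been seen before. The sketch below runs offline on a dictionary with the same shape the function reads (`number` and `people`), filled with the names from the output above. `group_by_craft` is a hypothetical helper name.

```python
from collections import defaultdict

# Same shape as the parsed response: a count plus a list of people.
# The names are copied from the output above.
sample = {
    "number": 7,
    "people": [
        {"craft": "ISS", "name": "Sergey Ryzhikov"},
        {"craft": "ISS", "name": "Kate Rubins"},
        {"craft": "ISS", "name": "Sergey Kud-Sverchkov"},
        {"craft": "ISS", "name": "Mike Hopkins"},
        {"craft": "ISS", "name": "Victor Glover"},
        {"craft": "ISS", "name": "Shannon Walker"},
        {"craft": "ISS", "name": "Soichi Noguchi"},
    ],
}


def group_by_craft(people):
    """Map each craft name to the list of astronauts aboard it."""
    crews = defaultdict(list)
    for person in people:
        crews[person["craft"]].append(person["name"])
    return dict(crews)


crews = group_by_craft(sample["people"])
for craft, names in crews.items():
    print("%s (%d): %s" % (craft, len(names), ", ".join(names)))
```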
