For this project, you will choose both an API to obtain data from and a web page to scrape.

For the API portion of the project, you will need to:
    - make calls to your chosen API,
    - successfully obtain a response,
    - request data,
    - convert it into a Pandas data frame,
    - and export it as a CSV file.

For the web scraping portion of the project, you will need to:
    - scrape the HTML from your chosen page,
    - parse the HTML to extract the necessary information,
    - and save the results to a text (txt) file if they are text, or to a CSV file if they are tabular data.

Technical Requirements
The technical requirements for this project are as follows:

    - You must obtain data from an API using Python.
    - You must scrape and clean HTML from a web page using Python.
    - The results should be two files: one containing the tabular results of your API request and the other containing the results of your web page scrape.
    - Your code should be saved in a Jupyter Notebook and your results should be saved in a folder named output.
    - You must demonstrate all the topics we covered in the chapter (functions, list comprehensions, string operations, and error handling) in your processing of the data.
    - There should be some data set that gets imported and some result that gets exported.
    - Your code should be saved in a Python executable file (.py), your data should be saved in a folder named data, and your results should be saved in a folder named output.
    - You should include a README.md file that describes the steps you took and your thought process for obtaining data from the API and web page.
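
The API steps listed above (call the API, check the response, build a data frame, export a CSV) could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the helper names `fetch_json` and `records_to_frame`, the `results` key, and the sample payload are all hypothetical, and a real API will likely return a different JSON shape.

```python
import tempfile
from pathlib import Path

import pandas as pd
import requests


def fetch_json(url, params=None, timeout=10):
    """Call an API endpoint and return the decoded JSON, or None on failure."""
    try:
        response = requests.get(url, params=params, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx status codes into exceptions
        return response.json()
    except (requests.exceptions.RequestException, ValueError) as error:
        # RequestException covers network and HTTP errors; ValueError covers invalid JSON
        print(f"Request failed: {error}")
        return None


def records_to_frame(records, columns):
    """Build a DataFrame from a list of dicts, keeping only the wanted columns."""
    rows = [{col: rec.get(col) for col in columns} for rec in records]
    return pd.DataFrame(rows, columns=columns)


# Made-up payload standing in for a real API response (hypothetical shape)
payload = {"results": [{"id": 1, "name": "alpha", "extra": "x"}, {"id": 2, "name": "beta"}]}
df_api = records_to_frame(payload["results"], columns=["id", "name"])

# A temporary directory keeps the demo self-contained; the project would use its output folder
csv_path = Path(tempfile.mkdtemp()) / "api_results.csv"
df_api.to_csv(csv_path, index=False)
```

In the project itself, `payload` would come from something like `fetch_json(<your endpoint>)`, and a `None` return value signals that the request failed and nothing should be exported.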

In [1]:
import requests
 
url = 'https://www.vice.com/es/section/ediciones-vice'
html = requests.get(url).content
html#[0:600]

b'<!DOCTYPE html><html lang="es" dir="ltr"><head><script>\n/*\n    TCF v2 version, //GDPR Stub file\n    Sourcepoint stub function\n    Last update: https://vicedev.atlassian.net/browse/FE-2649\n*/\n!function () { var e = function () { var e, t = "__tcfapiLocator", a = [], n = window; for (; n;) { try { if (n.frames[t]) { e = n; break } } catch (e) { } if (n === window.top) break; n = n.parent } e || (!function e() { var a = n.document, r = !!n.frames[t]; if (!r) if (a.body) { var i = a.createElement("iframe"); i.style.cssText = "display:none", i.name = t, a.body.appendChild(i) } else setTimeout(e, 5); return !r }(), n.__tcfapi = function () { for (var e, t = arguments.length, n = new Array(t), r = 0; r < t; r++)n[r] = arguments[r]; if (!n.length) return a; if ("setGdprApplies" === n[0]) n.length > 3 && 2 === parseInt(n[1], 10) && "boolean" == typeof n[3] && (e = n[3], "function" == typeof n[2] && n[2]("set", !0)); else if ("ping" === n[0]) { var i = { gdprApplies: e, cmpLoaded: !1, cm

In [2]:
from bs4 import BeautifulSoup
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, "html.parser")
soup

<!DOCTYPE html>
<html dir="ltr" lang="es"><head><script>
/*
    TCF v2 version, //GDPR Stub file
    Sourcepoint stub function
    Last update: https://vicedev.atlassian.net/browse/FE-2649
*/
!function () { var e = function () { var e, t = "__tcfapiLocator", a = [], n = window; for (; n;) { try { if (n.frames[t]) { e = n; break } } catch (e) { } if (n === window.top) break; n = n.parent } e || (!function e() { var a = n.document, r = !!n.frames[t]; if (!r) if (a.body) { var i = a.createElement("iframe"); i.style.cssText = "display:none", i.name = t, a.body.appendChild(i) } else setTimeout(e, 5); return !r }(), n.__tcfapi = function () { for (var e, t = arguments.length, n = new Array(t), r = 0; r < t; r++)n[r] = arguments[r]; if (!n.length) return a; if ("setGdprApplies" === n[0]) n.length > 3 && 2 === parseInt(n[1], 10) && "boolean" == typeof n[3] && (e = n[3], "function" == typeof n[2] && n[2]("set", !0)); else if ("ping" === n[0]) { var i = { gdprApplies: e, cmpLoaded: !1, cmpStatus

In [3]:
# Collect the text of every heading (h1 to h6) and paragraph element
tags = ['h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p']
texto = [element.text for element in soup.find_all(tags)]
texto

['Del futuro inviable a los futuros posibles en la Tierra',
 '"No es la tecnología lo que hace posible la vida en el planeta, no es la tecnología la que hará posible el futuro y ningún futuro es viable si pretende construirse solo con tecnología". ',
 'Edición #8: Odio',
 'Glosario del odio: Discursos que violentan desde la A a la Z',
 '¿Es el odio un lenguaje aprendido? ¿Una sensación verbalizada en la que nos educaron? ¿Una condición natural?',
 'Z de ZOILAMÉRICA',
 '"Son los mismos dictadores en la casa y en el país".',
 'Y de YO',
 'Yo, mí, me, conmigo.',
 'Edición 7: FUTURO',
 '¿Podemos pensar en un futuro político diferente en América Latina?: el Proyecto Recambio cree que sí',
 '“El problema es, sobre todo, de oferta: no tenemos políticos preparados entre los cuales elegir. Pero, ¿cómo esperamos que lo estén si no ofrecemos en la sociedad ningún espacio dónde formarse?”.',
 'El futuro de la música está en la raíz',
 'En Ecuador un grupo de jóvenes kichwa, conocidos como Humazapa
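
The brief asks for text results to be saved to a txt file, which the cell above does not yet do. A minimal sketch of that step follows. The helper `clean_text` and the file name `ediciones_vice.txt` are illustrative choices, and the sample list reuses a few strings from the output above, with extra padding and one blank entry added only to show the cleaning.

```python
import tempfile
from pathlib import Path


def clean_text(raw):
    """Collapse runs of whitespace (spaces, tabs, newlines) into single spaces."""
    return " ".join(raw.split())


# Sample entries; in the notebook this would be the scraped list `texto`
texto = [
    'Edición #8: Odio',
    'Y de YO',
    '  Edición 7:   FUTURO  ',  # padding added for illustration
    '   ',                      # blank entry added for illustration
]

# Keep only the entries that still contain text, cleaned up
lines = [clean_text(t) for t in texto if t.strip()]

# A temporary directory keeps the demo self-contained; the project would use its output folder
out_dir = Path(tempfile.mkdtemp()) / "output"
out_dir.mkdir(parents=True, exist_ok=True)
out_file = out_dir / "ediciones_vice.txt"
out_file.write_text("\n".join(lines), encoding="utf-8")
```

Writing one entry per line with an explicit `utf-8` encoding keeps the accented characters intact when the file is reopened.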

In [5]:
url = 'https://www.unitstatistics.com/ssbu/'
html = requests.get(url).content
# Reuse the HTML already downloaded instead of requesting the page a second time
soup = BeautifulSoup(html, "html.parser")
soup

<!DOCTYPE html>
<html lang="en-US">
<head><script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-24185965-21', 'auto');
  ga('send', 'pageview');

</script>
<meta charset="utf-8"/>
<meta content="width=device-width, initial-scale=1" name="viewport"/>
<link href="http://gmpg.org/xfn/11" rel="profile"/>
<title>Super Smash Bros Ultimate – Unit Statistics</title>
<meta content="max-image-preview:large" name="robots"/>
<link href="//fonts.googleapis.com" rel="dns-prefetch"/>
<link href="//s.w.org" rel="dns-prefetch"/>
<link crossorigin="" href="https://fonts.gstatic.com" rel="preconnect"/>
<link href="https://www.unitstatistics.com/feed/" rel="alternate" title="Unit Statistics » Feed" type="applicatio

In [39]:
table = soup.find_all('table',{'class':'tablepress'})[0]
table

<table class="tablepress tablepress-id-3 tablepress-responsive" id="tablepress-3">
<thead>
<tr class="row-1 odd">
<th class="column-1">Character</th><th class="column-2">Tier (beta)</th><th class="column-3">Weight</th><th class="column-4">Air Speed</th><th class="column-5">Falling Speed</th><th class="column-6">Fast fall</th><th class="column-7">Dash</th><th class="column-8">Run speed</th>
</tr>
</thead>
<tbody class="row-hover">
<tr class="row-2 even">
<td class="column-1"> <img src="https://www.unitstatistics.com/img/ssbu/20px-BayonettaHeadSSBU.png" style="margin-right:5px; margin-bottom:-10px; margin-top:-13px; vertical-align:middle" width="20"/>Bayonetta	</td><td class="column-2">	D	</td><td class="column-3">	81	</td><td class="column-4">	1,019	</td><td class="column-5">	1,770	</td><td class="column-6">	2,832	</td><td class="column-7">	1,936	</td><td class="column-8">	1,760	</td>
</tr>
<tr class="row-3 odd">
<td class="column-1"> <img src="https://www.unitstatistics.com/img/ssbu/20

In [62]:
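import re

# Every <tr> element in the table, header row included
renglon = table.find_all('tr')

# Take the first line of each row's text, then split it into cells on runs of tabs.
# The header cells are not tab-separated, so rows[0] stays a single fused string.
rows = [row.text.strip().split("\n") for row in renglon]
rows = [re.split(r"\t+", i[0]) for i in rows]
rows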

[['CharacterTier (beta)WeightAir SpeedFalling SpeedFast fallDashRun speed'],
 ['Bayonetta', 'D', '81', '1,019', '1,770', '2,832', '1,936', '1,760'],
 ['Bowser', 'F', '135', '1,155', '1,770', '2,832', '2,255', '1,971'],
 ['Bowser Jr.', 'G', '108', '1,134', '1,650', '2,640', '1,760', '1,566'],
 ['Captain Falcon', 'E', '104', '1,218', '1,865', '2,984', '1,980', '2,552'],
 ['Charizard', 'A', '116', '1,103', '1,520', '2,432', '2,288', '2,200'],
 ['Chrom', 'B', '95', '1,302', '1,800', '2,880', '2,200', '2,145'],
 ['Cloud', 'D', '100', '1,155', '1,680', '2,688', '2,145', '2,167'],
 ['Corrin', 'D', '98', '1,019', '1,650', '2,640', '1,892', '1,595'],
 ['Daisy', 'D', '89', '1,029', '1,190', '1,904', '1,826', '1,595'],
 ['Dark Pit', 'E', '96', '0,935', '1,480', '2,368', '2,090', '1,828'],
 ['Dark Samus', 'G', '108', '1,103', '1,330', '2,168', '1,870', '1,654'],
 ['Diddy Kong', 'A', '90', '0,924', '1,750', '2,800', '2,090', '2,006'],
 ['Donkey Kong', 'E', '127', '1,208', '1,630', '2,608', '2,090',
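
The header row above comes out as one fused string because its cells are not separated by tabs. One possible alternative, sketched here rather than taken from the notebook, is to read each `<th>`/`<td>` cell individually. The HTML below is a trimmed-down stand-in for the real table (three columns, two rows, values copied from the output above), and `row_cells` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Trimmed-down stand-in for the page's table; the real one has eight columns
sample_html = """
<table class="tablepress">
<thead><tr><th>Character</th><th>Tier (beta)</th><th>Weight</th></tr></thead>
<tbody>
<tr><td> <img src="x.png"/>Bayonetta\t</td><td>\tD\t</td><td>\t81\t</td></tr>
<tr><td> <img src="x.png"/>Bowser\t</td><td>\tF\t</td><td>\t135\t</td></tr>
</tbody>
</table>
"""

table = BeautifulSoup(sample_html, "html.parser").find("table", class_="tablepress")


def row_cells(tr):
    """Return the stripped text of every header or data cell in a table row."""
    return [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]


rows = [row_cells(tr) for tr in table.find_all("tr")]
header, data = rows[0], rows[1:]
```

With this approach `header` is already a list of column names, so it could be passed to `pd.DataFrame(data, columns=header)` instead of typing the names by hand.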

In [64]:
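import pandas as pd

# rows[0] is a single fused header string, so the column names are listed by hand.
# Note: 'DashRun' and 'speed' stand for the source columns 'Dash' and 'Run speed'.
colnames = rows[0]
data = rows[1:]
df = pd.DataFrame(data, columns=['Character','Tier (beta)','Weight','Air Speed','Falling Speed','Fast fall','DashRun','speed'])
df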

Unnamed: 0,Character,Tier (beta),Weight,Air Speed,Falling Speed,Fast fall,DashRun,speed
0,Bayonetta,D,81,1019,1770,2832,1936,1760
1,Bowser,F,135,1155,1770,2832,2255,1971
2,Bowser Jr.,G,108,1134,1650,2640,1760,1566
3,Captain Falcon,E,104,1218,1865,2984,1980,2552
4,Charizard,A,116,1103,1520,2432,2288,2200
...,...,...,...,...,...,...,...,...
74,Wolf,D,92,1281,1800,2880,2090,1540
75,Yoshi,B,104,1344,1290,2064,1980,2046
76,Young Link,D,88,0966,1800,2880,2090,1749
77,Zelda,F,85,1092,1350,2160,1958,1430
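
Every value scraped from the table is still a string, which is why `describe` later reports `unique`/`top` instead of means. A minimal sketch of a numeric conversion follows, assuming the comma in values such as `'1,019'` is a decimal separator; that reading is an assumption, not something the source states. The helper `to_number` is hypothetical, and the third row is invented only to show how a missing value behaves.

```python
import pandas as pd

# Two rows copied from the scraped output, plus one invented row with a missing value
df = pd.DataFrame(
    [
        ["Bayonetta", "D", "81", "1,019"],
        ["Dark Pit", "E", "96", "0,935"],
        ["Example", "D", "90", None],  # hypothetical row, not in the source
    ],
    columns=["Character", "Tier (beta)", "Weight", "Air Speed"],
)


def to_number(series):
    """Parse strings such as '1,019' as numbers, assuming a decimal comma."""
    swapped = series.str.replace(",", ".", regex=False)
    # errors="coerce" turns anything unparseable (including missing cells) into NaN
    return pd.to_numeric(swapped, errors="coerce")


numeric_cols = ["Weight", "Air Speed"]
df[numeric_cols] = df[numeric_cols].apply(to_number)
```

After the conversion, `df[numeric_cols].describe()` would report numeric summaries such as mean and quartiles.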


In [87]:
# Rename the columns to their Spanish labels
df = df.rename(columns={'Character':'Personaje',
                            'Tier (beta)':'Ranking','Weight':'Gordura','Air Speed':'Ligereza',
                            'Falling Speed':'Gravedad','Fast fall':'Caida','DashRun':'Trote','speed':'Velocidad'})
df

Unnamed: 0,Personaje,Ranking,Gordura,Ligereza,Gravedad,Caida,Trote,Velocidad
0,Bayonetta,D,81,1019,1770,2832,1936,1760
1,Bowser,F,135,1155,1770,2832,2255,1971
2,Bowser Jr.,G,108,1134,1650,2640,1760,1566
3,Captain Falcon,E,104,1218,1865,2984,1980,2552
4,Charizard,A,116,1103,1520,2432,2288,2200
...,...,...,...,...,...,...,...,...
74,Wolf,D,92,1281,1800,2880,2090,1540
75,Yoshi,B,104,1344,1290,2064,1980,2046
76,Young Link,D,88,0966,1800,2880,2090,1749
77,Zelda,F,85,1092,1350,2160,1958,1430


In [110]:
print(df.describe(include='object'))

        Personaje Ranking Gordura Ligereza Gravedad  Caida  Trote Velocidad
count          79      79      79       79       79     77     76        76
unique         79       7      35       46       41     41     35        59
top     Bayonetta       F      94    1,155    1,800  2,880  1,815     1,595
freq            1      17       5        6        6      6      7         5


In [104]:
df['Ranking'].describe()

count     79
unique     7
top        F
freq      17
Name: Ranking, dtype: object

In [105]:
df['Velocidad'].describe()

count        76
unique       59
top       1,595
freq          5
Name: Velocidad, dtype: object

In [106]:
df['Gordura'].describe()

count     79
unique    35
top       94
freq       5
Name: Gordura, dtype: object

In [115]:
# Keep only the characters ranked in tier A
TopA = df[df['Ranking'] == 'A']
TopA

Unnamed: 0,Personaje,Ranking,Gordura,Ligereza,Gravedad,Caida,Trote,Velocidad
4,Charizard,A,116,1103,1520,2432,2288,2200
11,Diddy Kong,A,90,924,1750,2800,2090,2006
15,Falco,A,82,977,1800,2880,2035,1619
24,Ivysaur,A,96,998,1380,2208,1903,1595
28,King K. Rool,A,133,945,1700,2720,1936,1485
39,Meta Knight,A,80,1040,1660,2656,2211,2090
57,Richter,A,107,940,1850,2960,1730,1520
66,Simon,A,107,940,1850,2960,1730,1520
69,Squirtle,A,75,1010,1350,2160,1936,1760


In [122]:
# List of all characters
df.to_csv(r'/Users/alanromero/IH/Proyectos/Personajes.csv')
# List of the tier A characters
TopA.to_csv(r'/Users/alanromero/IH/Proyectos/Personajes_Top_A.csv')
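
The brief asks for results to be saved in a folder named output, while the cell above writes to an absolute path on one machine. A possible variant is sketched below. The helper `export_csv` is hypothetical, the two-row frame is a stand-in for `df`, and a temporary base directory is used only to keep the demo self-contained. Passing `index=False` is a design choice: without it, the row index is written as an extra unnamed column, which pandas labels `Unnamed: 0` when the file is read back.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Small stand-in for the full character table
df = pd.DataFrame({"Personaje": ["Charizard", "Bowser"], "Ranking": ["A", "F"]})


def export_csv(frame, folder, filename):
    """Write a DataFrame to folder/filename, creating the folder if needed."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / filename
    frame.to_csv(path, index=False)  # skip the row index column
    return path


base = Path(tempfile.mkdtemp())  # in the project this would be the repository root
all_path = export_csv(df, base / "output", "Personajes.csv")
top_a_path = export_csv(df[df["Ranking"] == "A"], base / "output", "Personajes_Top_A.csv")

# Reading the file back shows only the real columns
reloaded = pd.read_csv(top_a_path)
```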