<h1 align=center>Processing Data</h1>
<p align=center><img src="https://img.freepik.com/vetores-premium/a-plataforma-esta-processando-dados-por-maquina_18660-693.jpg?w=2000" width=500></p>

<h3>Working with CSV and JSON data</h3>

In [2]:
import requests
from bs4 import BeautifulSoup

def get_planet_data():
    """Scrape the planets table and return one dict per planet row."""
    html = requests.get("http://localhost:8080/planets.html").text
    soup = BeautifulSoup(html, "lxml")
    # find_all is the current name for the legacy findAll alias
    planet_trs = soup.html.body.div.table.find_all("tr", {"class": "planet"})

    def to_dict(tr):
        # The first cell (index 0) is skipped; cells 1-5 hold the fields we keep
        tds = tr.find_all("td")
        planet_data = dict()
        planet_data['Name'] = tds[1].text.strip()
        planet_data['Mass'] = tds[2].text.strip()
        planet_data['Radius'] = tds[3].text.strip()
        planet_data['Description'] = tds[4].text.strip()
        planet_data['MoreInfo'] = tds[5].find_all("a")[0]["href"].strip()
        return planet_data

    planets = [to_dict(tr) for tr in planet_trs]
    return planets


if __name__ == "__main__":
    print(get_planet_data())


[{'Name': 'Mercury', 'Mass': '0.330', 'Radius': '4879', 'Description': 'Named Mercurius by the Romans because it appears to move so swiftly.', 'MoreInfo': 'https://en.wikipedia.org/wiki/Mercury_(planet)'}, {'Name': 'Venus', 'Mass': '4.87', 'Radius': '12104', 'Description': 'Roman name for the goddess of love. This planet was considered to be the brightest and most beautiful planet or star in the heavens. Other civilizations have named it for their god or goddess of love/war.', 'MoreInfo': 'https://en.wikipedia.org/wiki/Venus'}, {'Name': 'Earth', 'Mass': '5.97', 'Radius': '12756', 'Description': "The name Earth comes from the Indo-European base 'er,'which produced the Germanic noun 'ertho,' and ultimately German 'erde,' Dutch 'aarde,' Scandinavian 'jord,' and English 'earth.' Related forms include Greek 'eraze,' meaning 'on the ground,' and Welsh 'erw,' meaning 'a piece of land.'", 'MoreInfo': 'https://en.wikipedia.org/wiki/Earth'}, {'Name': 'Mars', 'Mass': '0.642', 'Radius': '6792', 'D
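
The scraper returns every field as a string, including `Mass` and `Radius`. If numeric work is planned, the values can be converted up front. A minimal sketch, assuming the record layout shown in the output above; `to_numeric_record` is a hypothetical helper, not part of the original code:

```python
def to_numeric_record(planet):
    """Return a copy of a scraped record with Mass and Radius as numbers."""
    converted = dict(planet)  # copy, so the scraped record is left untouched
    converted["Mass"] = float(planet["Mass"])
    converted["Radius"] = int(planet["Radius"])
    return converted


# Sample record copied from the scraper output above.
mercury = {
    "Name": "Mercury",
    "Mass": "0.330",
    "Radius": "4879",
    "Description": "Named Mercurius by the Romans because it appears to move so swiftly.",
    "MoreInfo": "https://en.wikipedia.org/wiki/Mercury_(planet)",
}

record = to_numeric_record(mercury)
print(record["Mass"], record["Radius"])
```

The original dictionary keeps its string values, so both forms stay available.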

In [3]:
# Convert to a CSV file

import csv

planets = get_planet_data()

with open('planets.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['Name', 'Mass', 'Radius', 'Description', 'MoreInfo'])
    for planet in planets:
        writer.writerow([planet['Name'],
                         planet['Mass'],
                         planet['Radius'],
                         planet['Description'],
                         planet['MoreInfo']])
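
Because each planet is already a dictionary, `csv.DictWriter` may be a more direct fit than building each row by hand. The sketch below writes to an in-memory buffer instead of `planets.csv` so it runs without the local server; the two sample records are a shortened, assumed subset of the scraped data:

```python
import csv
import io

# Shortened sample records (assumption: same keys as the scraped data).
planets = [
    {"Name": "Mercury", "Mass": "0.330", "Radius": "4879"},
    {"Name": "Venus", "Mass": "4.87", "Radius": "12104"},
]

buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["Name", "Mass", "Radius"])
writer.writeheader()       # header row comes from fieldnames
writer.writerows(planets)  # one CSV row per dictionary

csv_text = buffer.getvalue()
print(csv_text)
```

Swapping the buffer for `open('planets.csv', 'w', newline='')` would write the same rows to disk.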


In [4]:
# Read the data stored as CSV, retrieving the content with the Requests library

import io
import requests
import csv

planets_data = requests.get('http://localhost:8080/planets.csv').text
# Wrapping the text in StringIO lets csv.reader handle the line endings itself,
# so there is no trailing blank row to trim
reader = csv.reader(io.StringIO(planets_data, newline=''), delimiter=',', quotechar='"')
lines = list(reader)
for line in lines:
    print(line)

['Name', 'Mass', 'Radius', 'Description', 'MoreInfo']
['Mercury', '0.330', '4879', 'Named Mercurius by the Romans because it appears to move so swiftly.', 'https://en.wikipedia.org/wiki/Mercury_(planet)']
['Venus', '4.87', '12104', 'Roman name for the goddess of love. This planet was considered to be the brightest and most beautiful planet or star in the heavens. Other civilizations have named it for their god or goddess of love/war.', 'https://en.wikipedia.org/wiki/Venus']
['Earth', '5.97', '12756', "The name Earth comes from the Indo-European base 'er,'which produced the Germanic noun 'ertho,' and ultimately German 'erde,' Dutch 'aarde,' Scandinavian 'jord,' and English 'earth.' Related forms include Greek 'eraze,' meaning 'on the ground,' and Welsh 'erw,' meaning 'a piece of land.'", 'https://en.wikipedia.org/wiki/Earth']
['Mars', '0.642', '6792', 'Named by the Romans for their god of war because of its red, bloodlike color. Other civilizations also named this planet from this attri
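
The rows above are plain lists, so fields have to be addressed by position. As a hedged alternative, `csv.DictReader` uses the header row as keys. The sketch parses an inline CSV string (an assumed, shortened stand-in for the served `planets.csv`) so that it runs without the server:

```python
import csv
import io

# Shortened stand-in for the text returned by the server (assumption).
csv_text = (
    "Name,Mass,Radius\r\n"
    "Mercury,0.330,4879\r\n"
    "Venus,4.87,12104\r\n"
)

reader = csv.DictReader(io.StringIO(csv_text, newline=""))
rows = list(reader)  # each row is a mapping keyed by the header names
for row in rows:
    print(row["Name"], row["Mass"])
```

Values are still strings at this point; the `csv` module does no type conversion.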

In [5]:
# Using the pandas library is easier

import pandas as pd
planets_df = pd.read_csv("http://localhost:8080/planets.csv", index_col='Name')
planets_df

Name,Mass,Radius,Description,MoreInfo
Mercury,0.33,4879,Named Mercurius by the Romans because it appea...,https://en.wikipedia.org/wiki/Mercury_(planet)
Venus,4.87,12104,Roman name for the goddess of love. This plane...,https://en.wikipedia.org/wiki/Venus
Earth,5.97,12756,The name Earth comes from the Indo-European ba...,https://en.wikipedia.org/wiki/Earth
Mars,0.642,6792,Named by the Romans for their god of war becau...,https://en.wikipedia.org/wiki/Mars
Jupiter,1898.0,142984,The largest and most massive of the planets wa...,https://en.wikipedia.org/wiki/Jupiter
Saturn,568.0,120536,"Roman name for the Greek Cronos, father of Zeu...",https://en.wikipedia.org/wiki/Saturn
Uranus,86.8,51118,"Several astronomers, including Flamsteed and L...",https://en.wikipedia.org/wiki/Uranus
Neptune,102.0,49528,"Neptune was ""predicted"" by John Couch Adams an...",https://en.wikipedia.org/wiki/Neptune
Pluto,0.0146,2370,Pluto was discovered at Lowell Observatory in ...,https://en.wikipedia.org/wiki/Pluto
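
Unlike the `csv` module, `pd.read_csv` infers numeric types, which is why `Mass` prints as `0.33` rather than `0.330`. A small sketch of what that enables, using an inline CSV string (a shortened, assumed stand-in for `planets.csv`) instead of the local URL:

```python
import io

import pandas as pd

# Shortened stand-in for planets.csv (assumption: same column names).
csv_text = """Name,Mass,Radius
Mercury,0.330,4879
Venus,4.87,12104
Earth,5.97,12756
Mars,0.642,6792
"""

planets_df = pd.read_csv(io.StringIO(csv_text), index_col="Name")
print(planets_df.dtypes)  # Mass is parsed as float, Radius as integer

# Numeric columns can be filtered and sorted directly.
heavier_than_one = planets_df[planets_df["Mass"] > 1.0]
ranked = heavier_than_one.sort_values("Mass", ascending=False)
print(ranked)
```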


In [6]:
# We can load the data into a DataFrame and then export it to CSV

planets = get_planet_data()
planets_df = pd.DataFrame(planets).set_index('Name')
planets_df.to_csv('planets_pandas.csv')
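
When `to_csv` is called without a path it returns the CSV text instead of writing a file, which makes the round trip easy to inspect. A minimal sketch with assumed, shortened sample records in place of `get_planet_data()`:

```python
import io

import pandas as pd

# Shortened sample records (assumption: same keys as the scraped data).
planets = [
    {"Name": "Mercury", "Mass": "0.330", "Radius": "4879"},
    {"Name": "Venus", "Mass": "4.87", "Radius": "12104"},
]

planets_df = pd.DataFrame(planets).set_index("Name")
csv_text = planets_df.to_csv()  # no path given, so the CSV comes back as a string
print(csv_text)

# Reading the text back converts the numeric-looking strings to numbers.
restored = pd.read_csv(io.StringIO(csv_text), index_col="Name")
print(restored.dtypes)
```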

In [7]:
# To read the file back:

planets_df = pd.read_csv("http://localhost:8080/planets_pandas.csv", index_col='Name')
planets_df

Name,Mass,Radius,Description,MoreInfo
Mercury,0.33,4879,Named Mercurius by the Romans because it appea...,https://en.wikipedia.org/wiki/Mercury_(planet)
Venus,4.87,12104,Roman name for the goddess of love. This plane...,https://en.wikipedia.org/wiki/Venus
Earth,5.97,12756,The name Earth comes from the Indo-European ba...,https://en.wikipedia.org/wiki/Earth
Mars,0.642,6792,Named by the Romans for their god of war becau...,https://en.wikipedia.org/wiki/Mars
Jupiter,1898.0,142984,The largest and most massive of the planets wa...,https://en.wikipedia.org/wiki/Jupiter
Saturn,568.0,120536,"Roman name for the Greek Cronos, father of Zeu...",https://en.wikipedia.org/wiki/Saturn
Uranus,86.8,51118,"Several astronomers, including Flamsteed and L...",https://en.wikipedia.org/wiki/Uranus
Neptune,102.0,49528,"Neptune was ""predicted"" by John Couch Adams an...",https://en.wikipedia.org/wiki/Neptune
Pluto,0.0146,2370,Pluto was discovered at Lowell Observatory in ...,https://en.wikipedia.org/wiki/Pluto


In [9]:
# Converting to JSON

import json
planets = get_planet_data()
print(json.dumps(planets, indent=4))

[
    {
        "Name": "Mercury",
        "Mass": "0.330",
        "Radius": "4879",
        "Description": "Named Mercurius by the Romans because it appears to move so swiftly.",
        "MoreInfo": "https://en.wikipedia.org/wiki/Mercury_(planet)"
    },
    {
        "Name": "Venus",
        "Mass": "4.87",
        "Radius": "12104",
        "Description": "Roman name for the goddess of love. This planet was considered to be the brightest and most beautiful planet or star in the heavens. Other civilizations have named it for their god or goddess of love/war.",
        "MoreInfo": "https://en.wikipedia.org/wiki/Venus"
    },
    {
        "Name": "Earth",
        "Mass": "5.97",
        "Radius": "12756",
        "Description": "The name Earth comes from the Indo-European base 'er,'which produced the Germanic noun 'ertho,' and ultimately German 'erde,' Dutch 'aarde,' Scandinavian 'jord,' and English 'earth.' Related forms include Greek 'eraze,' meaning 'on the ground,' and Welsh 'erw

In [19]:
# Saving the data to a JSON file
with open('planets.json', 'w') as jsonfile:
    json.dump(planets, jsonfile, indent=4)
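
To check that nothing is lost on the way to disk, the file can be read back with `json.load` and compared with the original list. The sketch below uses a temporary directory and assumed, shortened sample records so it does not touch `planets.json`:

```python
import json
import tempfile
from pathlib import Path

# Shortened sample records (assumption: same keys as the scraped data).
planets = [
    {"Name": "Mercury", "Mass": "0.330", "Radius": "4879"},
    {"Name": "Venus", "Mass": "4.87", "Radius": "12104"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = Path(tmp_dir) / "planets.json"
    with open(json_path, "w") as jsonfile:
        json.dump(planets, jsonfile, indent=4)
    with open(json_path) as jsonfile:
        restored = json.load(jsonfile)

print(restored == planets)
```

JSON preserves the string values as strings, so `Mass` and `Radius` come back exactly as they were scraped.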

In [22]:
# JSON can be read from a server with Requests and converted into a Python object

planets_request = requests.get("http://localhost:8080/planets.json")
print(json.loads(planets_request.text))

[{'Name': 'Mercury', 'Mass': '0.330', 'Radius': '4879', 'Description': 'Named Mercurius by the Romans because it appears to move so swiftly.', 'MoreInfo': 'https://en.wikipedia.org/wiki/Mercury_(planet)'}, {'Name': 'Venus', 'Mass': '4.87', 'Radius': '12104', 'Description': 'Roman name for the goddess of love. This planet was considered to be the brightest and most beautiful planet or star in the heavens. Other civilizations have named it for their god or goddess of love/war.', 'MoreInfo': 'https://en.wikipedia.org/wiki/Venus'}, {'Name': 'Earth', 'Mass': '5.97', 'Radius': '12756', 'Description': "The name Earth comes from the Indo-European base 'er,'which produced the Germanic noun 'ertho,' and ultimately German 'erde,' Dutch 'aarde,' Scandinavian 'jord,' and English 'earth.' Related forms include Greek 'eraze,' meaning 'on the ground,' and Welsh 'erw,' meaning 'a piece of land.'", 'MoreInfo': 'https://en.wikipedia.org/wiki/Earth'}, {'Name': 'Mars', 'Mass': '0.642', 'Radius': '6792', 'D

In [25]:
# We can load the data into pandas and save it as JSON

planets = get_planet_data()
planets_df = pd.DataFrame(planets).set_index('Name')
planets_df.reset_index().to_json('planets_pandas.json', orient='records')
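
The `orient` argument controls the shape of the JSON that pandas emits. A short comparison of `orient='records'` (used above) with `orient='index'`, on assumed sample data; calling `to_json` without a path returns the text instead of writing a file:

```python
import json

import pandas as pd

# Shortened sample records (assumption: same keys as the scraped data).
planets_df = pd.DataFrame(
    [{"Name": "Mercury", "Mass": "0.330"}, {"Name": "Venus", "Mass": "4.87"}]
).set_index("Name")

# 'records' gives a list of row objects; reset_index() keeps Name as a field.
records_json = planets_df.reset_index().to_json(orient="records")

# 'index' gives one object keyed by the index labels.
index_json = planets_df.to_json(orient="index")

print(records_json)
print(index_json)
```

The `records` layout matches the list of dictionaries returned by `get_planet_data()`, which is presumably why the cell above resets the index before exporting.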

In [27]:
# We can open a JSON file in pandas

planets_df = pd.read_json("http://localhost:8080/planets_pandas.json").set_index('Name')
planets_df

Name,Mass,Radius,Description,MoreInfo
Mercury,0.33,4879,Named Mercurius by the Romans because it appea...,https://en.wikipedia.org/wiki/Mercury_(planet)
Venus,4.87,12104,Roman name for the goddess of love. This plane...,https://en.wikipedia.org/wiki/Venus
Earth,5.97,12756,The name Earth comes from the Indo-European ba...,https://en.wikipedia.org/wiki/Earth
Mars,0.642,6792,Named by the Romans for their god of war becau...,https://en.wikipedia.org/wiki/Mars
Jupiter,1898.0,142984,The largest and most massive of the planets wa...,https://en.wikipedia.org/wiki/Jupiter
Saturn,568.0,120536,"Roman name for the Greek Cronos, father of Zeu...",https://en.wikipedia.org/wiki/Saturn
Uranus,86.8,51118,"Several astronomers, including Flamsteed and L...",https://en.wikipedia.org/wiki/Uranus
Neptune,102.0,49528,"Neptune was ""predicted"" by John Couch Adams an...",https://en.wikipedia.org/wiki/Neptune
Pluto,0.0146,2370,Pluto was discovered at Lowell Observatory in ...,https://en.wikipedia.org/wiki/Pluto
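
`pd.read_json` also accepts a file-like object, which is handy for trying it out without the server. In this sketch the JSON text is inline and stores `Mass` and `Radius` as JSON numbers, an assumption that differs from the string values in the files above:

```python
import io

import pandas as pd

# Inline stand-in for planets_pandas.json (assumption: numeric values).
json_text = (
    '[{"Name": "Mercury", "Mass": 0.33, "Radius": 4879},'
    ' {"Name": "Venus", "Mass": 4.87, "Radius": 12104}]'
)

planets_df = pd.read_json(io.StringIO(json_text), orient="records").set_index("Name")
print(planets_df)
```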
