### GOAL 

![](json_df_goal.png)

The goal is to build the table above and save it as a local JSON file.

## Working on Jsons

In [1]:
import requests         # Fetches the file from the internet
import json             # Reads and writes JSON
import pandas as pd     # Data mining/wrangling

## 1. 

Read a JSON file from the web (not always needed).

If the JSON file is local, read it with `with open()` instead.


In [3]:
# This call downloads the file from the internet into Python:
r = requests.get(url='https://mdn.github.io/learning-area/javascript/oojs/json/superheroes.json')

json_readed = r.json()
print(type(json_readed))    # Always check that the result is a dictionary

json_readed

<class 'dict'>


{'squadName': 'Super Hero Squad',
 'homeTown': 'Metro City',
 'formed': 2016,
 'secretBase': 'Super tower',
 'active': True,
 'members': [{'name': 'Molecule Man',
   'age': 29,
   'secretIdentity': 'Dan Jukes',
   'powers': ['Radiation resistance', 'Turning tiny', 'Radiation blast']},
  {'name': 'Madame Uppercut',
   'age': 39,
   'secretIdentity': 'Jane Wilson',
   'powers': ['Million tonne punch',
    'Damage resistance',
    'Superhuman reflexes']},
  {'name': 'Eternal Flame',
   'age': 1000000,
   'secretIdentity': 'Unknown',
   'powers': ['Immortality',
    'Heat Immunity',
    'Inferno',
    'Teleportation',
    'Interdimensional travel']}]}

In [39]:
print(json_readed)  # print() shows the dictionary on a single horizontal line

{'squadName': 'Super Hero Squad', 'homeTown': 'Metro City', 'formed': 2016, 'secretBase': 'Super tower', 'active': True, 'members': [{'name': 'Molecule Man', 'age': 29, 'secretIdentity': 'Dan Jukes', 'powers': ['Radiation resistance', 'Turning tiny', 'Radiation blast']}, {'name': 'Madame Uppercut', 'age': 39, 'secretIdentity': 'Jane Wilson', 'powers': ['Million tonne punch', 'Damage resistance', 'Superhuman reflexes']}, {'name': 'Eternal Flame', 'age': 1000000, 'secretIdentity': 'Unknown', 'powers': ['Immortality', 'Heat Immunity', 'Inferno', 'Teleportation', 'Interdimensional travel']}]}


------------------

## 2. 

Save json in a local file called "data.json"

In [4]:
# Save the file locally. It is written to the current working directory as 'data.json'
with open('data.json', 'w+') as vr:
    json.dump(json_readed, vr)

------------------

## 3. 

Save with indent

In [5]:
# Step 2 writes the file on a single line, which is hard to read. indent makes it more readable; indent=4 is the usual choice
with open('data_indented.json', 'w+') as outfile:
    json.dump(json_readed, outfile, indent=4)
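
To see what `indent` changes, here is a minimal sketch that serializes the same data both ways with `json.dumps`. The `sample` dictionary is a shortened, illustrative copy of the superheroes data, not the full file.

```python
import json

# Shortened, illustrative copy of the superheroes data (an assumption, not the full file)
sample = {
    "squadName": "Super Hero Squad",
    "members": [{"name": "Molecule Man", "age": 29}],
}

compact = json.dumps(sample)            # everything on one line
pretty = json.dumps(sample, indent=4)   # one key or item per line, nested levels indented by 4 spaces

print(compact)
print(pretty)

# Both strings hold exactly the same data; only the whitespace differs
print(json.loads(compact) == json.loads(pretty))
```

The indentation only affects readability (and file size); a parser reads both versions identically.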

------------------

## 4.

Read local json


In [9]:
with open('data_indented.json', 'r') as infile:   # read-only mode is enough for loading
    json_readed = json.load(infile)
print(type(json_readed))
json_readed

<class 'dict'>


{'squadName': 'Super Hero Squad',
 'homeTown': 'Metro City',
 'formed': 2016,
 'secretBase': 'Super tower',
 'active': True,
 'members': [{'name': 'Molecule Man',
   'age': 29,
   'secretIdentity': 'Dan Jukes',
   'powers': ['Radiation resistance', 'Turning tiny', 'Radiation blast']},
  {'name': 'Madame Uppercut',
   'age': 39,
   'secretIdentity': 'Jane Wilson',
   'powers': ['Million tonne punch',
    'Damage resistance',
    'Superhuman reflexes']},
  {'name': 'Eternal Flame',
   'age': 1000000,
   'secretIdentity': 'Unknown',
   'powers': ['Immortality',
    'Heat Immunity',
    'Inferno',
    'Teleportation',
    'Interdimensional travel']}]}

## 5. 

Transform to pandas DataFrame. Two ways:

In [10]:
df = pd.DataFrame(json_readed)  # From the dictionary
df

| | squadName | homeTown | formed | secretBase | active | members |
|---|---|---|---|---|---|---|
| 0 | Super Hero Squad | Metro City | 2016 | Super tower | True | {'name': 'Molecule Man', 'age': 29, 'secretIde... |
| 1 | Super Hero Squad | Metro City | 2016 | Super tower | True | {'name': 'Madame Uppercut', 'age': 39, 'secret... |
| 2 | Super Hero Squad | Metro City | 2016 | Super tower | True | {'name': 'Eternal Flame', 'age': 1000000, 'sec... |


In [11]:
df_json = pd.read_json("data_indented.json")    # If the JSON is stored locally
df_json

| | squadName | homeTown | formed | secretBase | active | members |
|---|---|---|---|---|---|---|
| 0 | Super Hero Squad | Metro City | 2016 | Super tower | True | {'name': 'Molecule Man', 'age': 29, 'secretIde... |
| 1 | Super Hero Squad | Metro City | 2016 | Super tower | True | {'name': 'Madame Uppercut', 'age': 39, 'secret... |
| 2 | Super Hero Squad | Metro City | 2016 | Super tower | True | {'name': 'Eternal Flame', 'age': 1000000, 'sec... |


------------------

## 6.

### Data Mining & Data Wrangling
As you can see, the original JSON contains nested JSON objects: each cell of the `members` column still holds a whole dictionary. We therefore have to reshape the data before we can use it correctly (data wrangling).

How do you solve this issue? Research about this and try a solution. 

In [12]:
# I want the fields of each member as separate columns, not a dictionary inside the 'members' column
type(df_json["members"])  # Selecting a DataFrame column returns a pandas Series

pandas.core.series.Series

In [13]:
# This table represents the list of dictionaries nested inside the original dictionary, the one stored under 'members'
df_members_json = pd.DataFrame(json_readed["members"])
df_members_json

| | name | age | secretIdentity | powers |
|---|---|---|---|---|
| 0 | Molecule Man | 29 | Dan Jukes | [Radiation resistance, Turning tiny, Radiation... |
| 1 | Madame Uppercut | 39 | Jane Wilson | [Million tonne punch, Damage resistance, Super... |
| 2 | Eternal Flame | 1000000 | Unknown | [Immortality, Heat Immunity, Inferno, Teleport... |


--------------------------------------

**Concatenate the DataFrames column-wise**

In [14]:
# Join the two DataFrames. We now have the main table plus the new table built from the inner dictionaries
final_df = pd.concat([df_json, df_members_json], axis=1)
final_df

| | squadName | homeTown | formed | secretBase | active | members | name | age | secretIdentity | powers |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Super Hero Squad | Metro City | 2016 | Super tower | True | {'name': 'Molecule Man', 'age': 29, 'secretIde... | Molecule Man | 29 | Dan Jukes | [Radiation resistance, Turning tiny, Radiation... |
| 1 | Super Hero Squad | Metro City | 2016 | Super tower | True | {'name': 'Madame Uppercut', 'age': 39, 'secret... | Madame Uppercut | 39 | Jane Wilson | [Million tonne punch, Damage resistance, Super... |
| 2 | Super Hero Squad | Metro City | 2016 | Super tower | True | {'name': 'Eternal Flame', 'age': 1000000, 'sec... | Eternal Flame | 1000000 | Unknown | [Immortality, Heat Immunity, Inferno, Teleport... |


#### Drop the members column

In [15]:
# concat only placed the tables side by side, so we drop the 'members' column: its contents are now spread across their own columns
final_df = final_df.drop(["members"], axis=1)
final_df

| | squadName | homeTown | formed | secretBase | active | name | age | secretIdentity | powers |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Super Hero Squad | Metro City | 2016 | Super tower | True | Molecule Man | 29 | Dan Jukes | [Radiation resistance, Turning tiny, Radiation... |
| 1 | Super Hero Squad | Metro City | 2016 | Super tower | True | Madame Uppercut | 39 | Jane Wilson | [Million tonne punch, Damage resistance, Super... |
| 2 | Super Hero Squad | Metro City | 2016 | Super tower | True | Eternal Flame | 1000000 | Unknown | [Immortality, Heat Immunity, Inferno, Teleport... |
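
As an alternative to the concat-and-drop steps above, pandas also provides `pd.json_normalize`, which can flatten a nested list of records in one call. The following is only a sketch of that alternative, not the approach used in this notebook; `squad` is a shortened, illustrative copy of the data and `flat` is a hypothetical variable name.

```python
import pandas as pd

# Shortened, illustrative copy of the superheroes data (an assumption, not the full file)
squad = {
    "squadName": "Super Hero Squad",
    "homeTown": "Metro City",
    "formed": 2016,
    "secretBase": "Super tower",
    "active": True,
    "members": [
        {"name": "Molecule Man", "age": 29, "secretIdentity": "Dan Jukes",
         "powers": ["Radiation resistance", "Turning tiny"]},
        {"name": "Madame Uppercut", "age": 39, "secretIdentity": "Jane Wilson",
         "powers": ["Million tonne punch", "Damage resistance"]},
    ],
}

# record_path points at the nested list: each member becomes one row.
# meta lists the top-level keys that are repeated on every row.
flat = pd.json_normalize(
    squad,
    record_path="members",
    meta=["squadName", "homeTown", "formed", "secretBase", "active"],
)

print(flat.columns.tolist())
print(flat)
```

Note that the member columns come first here and the squad-level columns last, so the column order differs from `final_df`; the `powers` column still contains lists in both cases.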


--------------------

# EXTRA

--------------------


In [16]:
# Save to a local file
final_df.to_json("json_name.json")

Note:

1. With `dumps`, we serialize the contents of a dictionary into a JSON string.
2. With `loads`, we parse a JSON string back into Python objects (dictionaries, lists, and so on).
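
A minimal sketch of that round trip, using a small illustrative dictionary (`hero` is a hypothetical name, not part of the notebook's data flow):

```python
import json

# Small illustrative dictionary (hypothetical example data)
hero = {"name": "Molecule Man", "age": 29, "active": True, "powers": ["Turning tiny"]}

text = json.dumps(hero)       # dict -> str (Python's True is written as JSON's true)
restored = json.loads(text)   # str -> dict

print(type(text).__name__, type(restored).__name__)
print(text)
print(restored == hero)
```

The versions without the trailing "s", `json.dump` and `json.load`, do the same conversions but write to and read from an open file object instead of a string.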

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_json.html

In [17]:
# Save to a local file with indentation
import json
json_result = final_df.to_json(orient="records")    # DataFrame -> JSON string, one object per row
parsed = json.loads(json_result)    # JSON string -> list of dictionaries, so json.dump can indent it
with open("final_json.json", 'w+') as outfile:
    json.dump(parsed, outfile, indent=4)

In [21]:
df_f = pd.read_json("final_json.json")
# Anyone who loads the final JSON will directly see the table I built
df_f

| | squadName | homeTown | formed | secretBase | active | name | age | secretIdentity | powers |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Super Hero Squad | Metro City | 2016 | Super tower | True | Molecule Man | 29 | Dan Jukes | [Radiation resistance, Turning tiny, Radiation... |
| 1 | Super Hero Squad | Metro City | 2016 | Super tower | True | Madame Uppercut | 39 | Jane Wilson | [Million tonne punch, Damage resistance, Super... |
| 2 | Super Hero Squad | Metro City | 2016 | Super tower | True | Eternal Flame | 1000000 | Unknown | [Immortality, Heat Immunity, Inferno, Teleport... |


In [23]:
df_f.to_csv("datos_finales.csv")

In [22]:
# The DataFrame can also be saved to Excel
df_f.to_excel("datos_finales.xlsx")

ModuleNotFoundError: No module named 'openpyxl'
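
The error above appears because pandas writes `.xlsx` files through the `openpyxl` package, which is not installed in this environment; installing it (for example with `pip install openpyxl`) and re-running the cell should fix it. As a sketch of a defensive alternative, the hypothetical helper `save_table` below checks for `openpyxl` first and falls back to CSV when it is missing. The helper name, the demo data, and the temporary directory are all assumptions made for illustration.

```python
import importlib.util
import os
import tempfile
from pathlib import Path

import pandas as pd


def save_table(df, stem):
    """Write df to <stem>.xlsx when openpyxl is installed, otherwise to <stem>.csv.

    Hypothetical helper for illustration; returns the path that was written.
    """
    if importlib.util.find_spec("openpyxl") is not None:
        path = Path(f"{stem}.xlsx")
        df.to_excel(path, index=False)
    else:
        path = Path(f"{stem}.csv")
        df.to_csv(path, index=False)
    return path


# Demo with a tiny illustrative table, written to a temporary directory
demo = pd.DataFrame({"name": ["Molecule Man", "Madame Uppercut"], "age": [29, 39]})
with tempfile.TemporaryDirectory() as tmp:
    written = save_table(demo, os.path.join(tmp, "datos_finales"))
    print(written.name)
```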