<a href="https://colab.research.google.com/github/cristiandarioortegayubro/BA/blob/main/pd_03.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![logo](https://github.com/cristiandarioortegayubro/BA/blob/main/dba.png?raw=true)


![pandas](https://upload.wikimedia.org/wikipedia/commons/thumb/e/ed/Pandas_logo.svg/250px-Pandas_logo.svg.png)

[Pandas plotting](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.plot.html?highlight=plot#pandas.DataFrame.plot)

## **Installing the required modules**

In [1]:
!pip install pandas_datareader --upgrade



## **Required modules**

### ***For data analysis***

In [2]:
import pandas as pd
import numpy as np
import pandas_datareader as dtr

## **Getting the data**

In [3]:
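# Load the sales invoices from the course repository.
datos = "https://raw.githubusercontent.com/cristiandarioortegayubro/BA/main/Datasets/Analisis_de_Facturas_de_Venta.csv"
df = pd.read_csv(datos)
# Parse the date column (infer_datetime_format is deprecated since pandas 2.0).
df["FECHA"] = pd.to_datetime(df["FECHA"])
# Drop the columns that are not needed and keep only invoices in Argentine pesos.
df.drop(columns=["DOCUMENTO","TIPOFACTURA","PRECIO", "CANTIDAD"], inplace=True)
df = df[df.MONEDA == "Pesos Argentinos"]
df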

,ORGANIZACION,FECHA,PRODUCTO,IMPORTEMONTRANSACCION,MONEDA
0,Etigand S.A.,2019-12-27,Consultoria Web,289256.20,Pesos Argentinos
1,Javier Moroni,2019-12-21,Honorarios,99586.78,Pesos Argentinos
2,Amazon Group,2019-10-12,Parlantes para PC,59700.00,Pesos Argentinos
3,Amazon Group,2019-06-12,Mercaderia de Reventa,310740.00,Pesos Argentinos
4,Manganello S.R.L,2019-11-20,Consultoria Web,315000.00,Pesos Argentinos
...,...,...,...,...,...
154,Hernandez S.A.,2018-01-31,Dise–o Web,31500.00,Pesos Argentinos
155,Etigand S.A.,2018-01-25,Impresora Multifuncion,28429.75,Pesos Argentinos
156,Etigand S.A.,2018-01-25,Monitores,26446.28,Pesos Argentinos
157,Ferrari Hnos S.A.,2018-01-01,Impresora Multifuncion,35537.19,Pesos Argentinos


## **Grouping**

Grouping is a process that involves one or more of the following steps:

- **Split** the data into groups based on some criterion.
- **Apply** a function to each group independently.
- **Combine** the results into a data structure, such as a DataFrame.
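
The three steps can be sketched by hand on a small example. This is only an illustration: the `sales` DataFrame and its `IMPORTE` column are made-up stand-ins, not the invoice data, and the manual version exists just to show what `groupby` does in one call.

```python
import pandas as pd

# Hypothetical toy data; the column names only mimic the invoice dataset.
sales = pd.DataFrame({
    "ORGANIZACION": ["A", "B", "A", "B", "A"],
    "IMPORTE": [100.0, 50.0, 25.0, 10.0, 5.0],
})

# Split: one sub-DataFrame per organization.
parts = {name: group for name, group in sales.groupby("ORGANIZACION")}

# Apply: reduce each group independently.
manual_totals = {name: group["IMPORTE"].sum() for name, group in parts.items()}

# Combine: gather the per-group results into a single Series.
combined = pd.Series(manual_totals, name="IMPORTE")

# groupby performs the same three steps in a single expression.
direct = sales.groupby("ORGANIZACION")["IMPORTE"].sum()

print(combined)
print(direct)
```

Both Series hold the same totals per organization; in practice the one-line `groupby(...).sum()` form is the one to use.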

In [4]:
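# Total invoiced per customer; numeric_only=True keeps the date and text columns out of the sum.
clientes = df.groupby("ORGANIZACION").sum(numeric_only=True)
clientes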

ORGANIZACION,IMPORTEMONTRANSACCION
Amazon Group,795766.0
Berker S.R.L,262700.0
Campomas S.A.,648262.0
Donadelli S.R.L,269210.0
Etigand S.A.,652421.49
Ferrari Hnos S.A.,320756.2
Galardon S.A.,474697.0
Hernandez S.A.,617565.0
Horacio Aguirre,5000.0
Horizonte S.A.,482835.0


In [5]:
# .groups maps each organization to the index labels of its rows.
dicc = df.groupby("ORGANIZACION").groups
dicc

{'Amazon Group': [2, 3, 9, 45, 64, 65, 66, 73, 74, 75, 87, 88, 92], 'Berker S.R.L': [22, 37, 38, 39, 94, 137, 153], 'Campomas S.A.': [15, 16, 20, 29, 35, 46, 95], 'Donadelli S.R.L': [53, 104, 112, 126, 129, 147], 'Etigand S.A.': [0, 50, 103, 111, 127, 140, 152, 155, 156], 'Ferrari Hnos S.A.': [17, 30, 136, 157, 158], 'Galardon S.A.': [5, 31, 100, 128, 138, 141, 146], 'Hernandez S.A.': [13, 24, 34, 47, 56, 60, 76, 78, 96, 106, 114, 124, 143, 154], 'Horacio Aguirre': [77], 'Horizonte S.A.': [8, 44, 52, 142, 151], 'Ignition S.A.C.I': [23, 79, 80, 86, 135, 149, 150], 'Jameson SRL': [18, 36, 59, 62, 70, 101, 148], 'Javier Moroni': [1, 21, 26, 40, 51, 55, 71, 105, 109, 113, 117, 119, 125, 130, 139, 145], 'Juan Fernández': [7, 11, 14], 'Juan Lopez': [32, 42], 'Klarkson': [6, 48, 58, 63, 93], 'Manganello S.R.L': [4, 10, 19, 25, 43, 54, 61, 68, 69, 85, 107, 115, 118, 121, 122, 123], 'Nicolasen y Asociados S.A.': [12, 27, 33, 41, 49, 57, 67, 72, 91, 102, 108, 110, 116, 120, 132, 133, 134, 144], 

In [6]:
dicc.keys()

dict_keys(['Amazon Group', 'Berker S.R.L', 'Campomas S.A.', 'Donadelli S.R.L', 'Etigand S.A.', 'Ferrari Hnos S.A.', 'Galardon S.A.', 'Hernandez S.A.', 'Horacio Aguirre', 'Horizonte S.A.', 'Ignition S.A.C.I', 'Jameson SRL', 'Javier Moroni', 'Juan Fernández', 'Juan Lopez', 'Klarkson', 'Manganello S.R.L', 'Nicolasen y Asociados S.A.', 'Rodrigo Vidal', 'Valeria Welponer'])

In [7]:
# Iterating over a groupby yields each key together with its sub-DataFrame.
for name, group in df.groupby("ORGANIZACION"):
    print(name)
    print(group)

Amazon Group
    ORGANIZACION      FECHA                 PRODUCTO  IMPORTEMONTRANSACCION  \
2   Amazon Group 2019-10-12        Parlantes para PC                59700.0   
3   Amazon Group 2019-06-12    Mercaderia de Reventa               310740.0   
9   Amazon Group 2019-09-30  Servicios Profesionales                 6700.0   
45  Amazon Group 2019-03-21  Servicios Profesionales                30000.0   
64  Amazon Group 2018-12-23               Dise–o Web                65800.0   
65  Amazon Group 2018-12-23   Publicidad y Marketing                31500.0   
66  Amazon Group 2018-12-23          Insumos oficina                 3500.0   
73  Amazon Group 2018-08-11   Publicidad y Marketing               160100.0   
74  Amazon Group 2018-08-11          Consultoria Web                 6300.0   
75  Amazon Group 2018-08-11             Modem Router                 5676.0   
87  Amazon Group 2018-05-10               Honorarios                50550.0   
88  Amazon Group 2018-05-10   Publicida
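
Printing every group produces a lot of output. When only one customer is of interest, a single group can be pulled out directly. The sketch below uses a small made-up DataFrame (`df_demo` is hypothetical, not the invoice data) to show `.groups`, `.get_group` and `.size` side by side.

```python
import pandas as pd

# Hypothetical toy data standing in for the invoice DataFrame.
df_demo = pd.DataFrame({
    "ORGANIZACION": ["Amazon Group", "Klarkson", "Amazon Group"],
    "IMPORTEMONTRANSACCION": [59700.0, 1200.0, 310740.0],
})

grouped = df_demo.groupby("ORGANIZACION")

# .groups maps each key to the index labels of its rows.
amazon_labels = list(grouped.groups["Amazon Group"])

# .get_group returns the rows of one group as a DataFrame.
amazon = grouped.get_group("Amazon Group")

# .size counts the rows in each group.
counts = grouped.size()

print(amazon_labels)
print(amazon)
print(counts)
```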

## **Getting financial data**

In [8]:
# Daily BTC-USD prices for January 2021; keep Date, High, Low, Open and Close.
btc_enero = dtr.DataReader("BTC-USD", data_source="yahoo", start="2021-01-01", end="2021-01-31")
btc_enero.drop(columns=["Volume","Adj Close"], inplace=True)
btc_enero.reset_index(inplace=True)
btc_enero.head()

,Date,High,Low,Open,Close
0,2021-01-01,29600.626953,28803.585938,28994.009766,29374.152344
1,2021-01-02,33155.117188,29091.181641,29376.455078,32127.267578
2,2021-01-03,34608.558594,32052.316406,32129.408203,32782.023438
3,2021-01-04,33440.21875,28722.755859,32810.949219,31971.914062
4,2021-01-05,34437.589844,30221.1875,31977.041016,33992.429688


In [9]:
btc_enero.shape

(32, 5)

In [10]:
# Same download and cleanup for February 2021.
btc_febrero = dtr.DataReader("BTC-USD", data_source="yahoo", start="2021-02-01", end="2021-02-28")
btc_febrero.drop(columns=["Volume","Adj Close"], inplace=True)
btc_febrero.reset_index(inplace=True)
btc_febrero.head()

,Date,High,Low,Open,Close
0,2021-02-01,34638.214844,32384.228516,33114.578125,33537.175781
1,2021-02-02,35896.882812,33489.21875,33533.199219,35510.289062
2,2021-02-03,37480.1875,35443.984375,35510.820312,37472.089844
3,2021-02-04,38592.175781,36317.5,37475.105469,36926.066406
4,2021-02-05,38225.90625,36658.761719,36931.546875,38144.308594


In [11]:
btc_febrero.shape

(29, 5)

In [12]:
# Same download and cleanup for March 2021.
btc_marzo = dtr.DataReader("BTC-USD", data_source="yahoo", start="2021-03-01", end="2021-03-31")
btc_marzo.drop(columns=["Volume","Adj Close"], inplace=True)
btc_marzo.reset_index(inplace=True)
btc_marzo.head()

,Date,High,Low,Open,Close
0,2021-03-01,49784.015625,45115.09375,45159.503906,49631.242188
1,2021-03-02,50127.511719,47228.84375,49612.105469,48378.988281
2,2021-03-03,52535.136719,48274.320312,48415.816406,50538.242188
3,2021-03-04,51735.089844,47656.929688,50522.304688,48561.167969
4,2021-03-05,49396.429688,46542.515625,48527.03125,48927.304688


In [13]:
btc_marzo.shape

(32, 5)
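
The three download cells differ only in their dates, so the cleanup could live in one small function. The sketch below is one possible way to do that: `tidy_prices` is a hypothetical helper name, and `raw` is a made-up one-row frame that imitates the shape `DataReader` returns (a `Date` index plus price columns), so it runs without a network connection.

```python
import pandas as pd

def tidy_prices(raw: pd.DataFrame) -> pd.DataFrame:
    """Drop the unused columns and turn the Date index into a regular column."""
    return raw.drop(columns=["Volume", "Adj Close"]).reset_index()

# Hypothetical stand-in for a downloaded frame: Date index and price columns.
raw = pd.DataFrame(
    {"High": [2.0], "Low": [1.0], "Open": [1.5], "Close": [1.8],
     "Volume": [100], "Adj Close": [1.8]},
    index=pd.DatetimeIndex(["2021-01-01"], name="Date"),
)

tidy = tidy_prices(raw)
print(tidy)
```

Because the helper returns a new DataFrame instead of using `inplace=True`, the downloaded data stays untouched and each month can be cleaned with a single call.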

## **Concatenation**

In [14]:
btc = [btc_enero, btc_febrero, btc_marzo]

In [15]:
bitcoin = pd.concat(btc)

In [16]:
bitcoin

,Date,High,Low,Open,Close
0,2021-01-01,29600.626953,28803.585938,28994.009766,29374.152344
1,2021-01-02,33155.117188,29091.181641,29376.455078,32127.267578
2,2021-01-03,34608.558594,32052.316406,32129.408203,32782.023438
3,2021-01-04,33440.218750,28722.755859,32810.949219,31971.914062
4,2021-01-05,34437.589844,30221.187500,31977.041016,33992.429688
...,...,...,...,...,...
27,2021-03-28,56610.312500,55071.113281,55974.941406,55950.746094
28,2021-03-29,58342.097656,55139.339844,55947.898438,57750.199219
29,2021-03-30,59447.222656,57251.550781,57750.132812,58917.691406
30,2021-03-31,59930.027344,57726.417969,58930.277344,58918.832031
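
In the output above the index labels restart with each month, because `pd.concat` keeps the original index of every piece. If a single running index is preferred, `ignore_index=True` renumbers the rows. A minimal sketch with two made-up frames (`jan` and `feb` are hypothetical stand-ins for the monthly downloads):

```python
import pandas as pd

# Hypothetical two-row frames standing in for the monthly downloads.
jan = pd.DataFrame({"Date": pd.to_datetime(["2021-01-30", "2021-01-31"]),
                    "Close": [1.0, 2.0]})
feb = pd.DataFrame({"Date": pd.to_datetime(["2021-02-01", "2021-02-02"]),
                    "Close": [3.0, 4.0]})

# Default: each piece keeps its own index labels, so labels repeat.
stacked = pd.concat([jan, feb])

# ignore_index=True discards the old labels and numbers the rows from 0.
renumbered = pd.concat([jan, feb], ignore_index=True)

print(stacked.index.tolist())
print(renumbered.index.tolist())
```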


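## **Duplicates**

Each monthly download also returns the first day of the following month: the last row of `btc_enero` is 2021-02-01, which is also the first row of `btc_febrero`. The concatenated DataFrame therefore contains repeated rows.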

In [17]:
btc_enero.tail()

,Date,High,Low,Open,Close
27,2021-01-28,33858.3125,30023.207031,30441.041016,33466.097656
28,2021-01-29,38406.261719,32064.814453,34318.671875,34316.386719
29,2021-01-30,34834.707031,32940.1875,34295.933594,34269.523438
30,2021-01-31,34288.332031,32270.175781,34270.878906,33114.359375
31,2021-02-01,34638.214844,32384.228516,33114.578125,33537.175781


In [18]:
btc_febrero.head()

,Date,High,Low,Open,Close
0,2021-02-01,34638.214844,32384.228516,33114.578125,33537.175781
1,2021-02-02,35896.882812,33489.21875,33533.199219,35510.289062
2,2021-02-03,37480.1875,35443.984375,35510.820312,37472.089844
3,2021-02-04,38592.175781,36317.5,37475.105469,36926.066406
4,2021-02-05,38225.90625,36658.761719,36931.546875,38144.308594


In [19]:
btc_febrero.tail()

,Date,High,Low,Open,Close
24,2021-02-25,51948.96875,47093.851562,49709.082031,47093.851562
25,2021-02-26,48370.785156,44454.84375,47180.464844,46339.761719
26,2021-02-27,48253.269531,45269.027344,46344.773438,46188.453125
27,2021-02-28,46716.429688,43241.617188,46194.015625,45137.769531
28,2021-03-01,49784.015625,45115.09375,45159.503906,49631.242188


In [20]:
btc_marzo.head()

,Date,High,Low,Open,Close
0,2021-03-01,49784.015625,45115.09375,45159.503906,49631.242188
1,2021-03-02,50127.511719,47228.84375,49612.105469,48378.988281
2,2021-03-03,52535.136719,48274.320312,48415.816406,50538.242188
3,2021-03-04,51735.089844,47656.929688,50522.304688,48561.167969
4,2021-03-05,49396.429688,46542.515625,48527.03125,48927.304688


In [21]:
bitcoin.drop_duplicates()

,Date,High,Low,Open,Close
0,2021-01-01,29600.626953,28803.585938,28994.009766,29374.152344
1,2021-01-02,33155.117188,29091.181641,29376.455078,32127.267578
2,2021-01-03,34608.558594,32052.316406,32129.408203,32782.023438
3,2021-01-04,33440.218750,28722.755859,32810.949219,31971.914062
4,2021-01-05,34437.589844,30221.187500,31977.041016,33992.429688
...,...,...,...,...,...
27,2021-03-28,56610.312500,55071.113281,55974.941406,55950.746094
28,2021-03-29,58342.097656,55139.339844,55947.898438,57750.199219
29,2021-03-30,59447.222656,57251.550781,57750.132812,58917.691406
30,2021-03-31,59930.027344,57726.417969,58930.277344,58918.832031
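
Note that `drop_duplicates()` returns a new DataFrame and leaves `bitcoin` unchanged unless the result is assigned. The sketch below shows one way to count the repeated rows, remove them by date and renumber the index. The frames `jan` and `feb` are made-up stand-ins with the same kind of overlap as the monthly downloads.

```python
import pandas as pd

# Hypothetical frames: the last row of `jan` repeats the first row of `feb`.
jan = pd.DataFrame({"Date": pd.to_datetime(["2021-01-31", "2021-02-01"]),
                    "Close": [10.0, 11.0]})
feb = pd.DataFrame({"Date": pd.to_datetime(["2021-02-01", "2021-02-02"]),
                    "Close": [11.0, 12.0]})
both = pd.concat([jan, feb])

# duplicated() flags rows whose values repeat an earlier row.
n_repeated = int(both.duplicated().sum())

# Keep the first row for each date, store the result and renumber the index.
clean = both.drop_duplicates(subset="Date").reset_index(drop=True)

print(n_repeated)
print(clean)
```

Using `subset="Date"` treats two rows as duplicates when they share a date, even if a price column were to differ slightly between downloads.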
