![Pandas](Pandas.png)

### Pandas is a Python library that offers a wide range of functionality for manipulating, cleaning, transforming, and analyzing data. It can load data from several formats, such as CSV, Excel, and SQL, among others. It also supports filtering, aggregation, sorting, and joining data, as well as handling missing values.

### Pandas is often used together with other popular Python libraries, such as NumPy, Matplotlib, and scikit-learn, to build complete data analysis and machine learning workflows.
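
As a rough sketch of the operations mentioned above (missing values, filtering, joining, aggregation, and sorting), the following uses two tiny tables. The names `orders` and `prices` and all of their values are invented for illustration; they are not part of the dataset used later in this notebook.

```python
import pandas as pd

# Hypothetical toy data, invented for this illustration only.
orders = pd.DataFrame({
    "product": ["cable", "phone", "cable", "monitor"],
    "quantity": [2, 1, None, 3],  # None is stored as NaN (a missing value)
})
prices = pd.DataFrame({
    "product": ["cable", "phone", "monitor"],
    "price": [10.0, 600.0, 150.0],
})

# Missing values: treat a missing quantity as 0.
orders["quantity"] = orders["quantity"].fillna(0)

# Filtering: keep only rows with a positive quantity.
positive = orders[orders["quantity"] > 0]

# Joining: attach the unit price of each product.
merged = positive.merge(prices, on="product", how="left")

# Aggregation and sorting: total quantity per product, largest first.
totals = merged.groupby("product")["quantity"].sum().sort_values(ascending=False)
print(totals)
```

Whether a missing quantity should become 0, be dropped, or be handled some other way depends on the data; `fillna(0)` is only one possible choice.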

In [None]:
# !pip install pandas

In [1]:
# Import the library
import pandas as pd

In [2]:
dc = {"coluna1": [1,2,3,4,5], "coluna2": [6,7,8,9,10]}
dc

{'coluna1': [1, 2, 3, 4, 5], 'coluna2': [6, 7, 8, 9, 10]}

In [3]:
df = pd.DataFrame(dc)
df

|  | coluna1 | coluna2 |
|---|---|---|
| 0 | 1 | 6 |
| 1 | 2 | 7 |
| 2 | 3 | 8 |
| 3 | 4 | 9 |
| 4 | 5 | 10 |


In [5]:
df["coluna1"]

0    1
1    2
2    3
3    4
4    5
Name: coluna1, dtype: int64

In [4]:
type(df)

pandas.core.frame.DataFrame
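
A small sketch of the difference between the two outputs above: selecting with a single column name returns a `Series`, while selecting with a list of names returns a `DataFrame`, even when the list has only one name. The variable `df_demo` is a stand-in name used here so the example runs on its own.

```python
import pandas as pd

df_demo = pd.DataFrame({"coluna1": [1, 2, 3, 4, 5], "coluna2": [6, 7, 8, 9, 10]})

one_column = df_demo["coluna1"]    # a single name returns a Series
as_frame = df_demo[["coluna1"]]    # a list of names returns a DataFrame

print(type(one_column).__name__)   # Series
print(type(as_frame).__name__)     # DataFrame
```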

In [30]:
# Read the dataset (reading .xlsx files requires the openpyxl package).
# Order Date is loaded as text here and converted to datetime further down.
df = pd.read_excel("Vendas.xlsx")

In [7]:
# View the first 5 rows of the dataset
df.head()

|  | Order ID | Product | Quantity Ordered | Price Each | Order Date | Purchase Address |
|---|---|---|---|---|---|---|
| 0 | 176558 | USB-C Charging Cable | 2 | 11.95 | 04/19/19 08:46 | 917 1st St, Dallas, TX 75001 |
| 1 | 176559 | Bose SoundSport Headphones | 1 | 99.99 | 04/07/19 22:30 | 682 Chestnut St, Boston, MA 02215 |
| 2 | 176560 | Google Phone | 1 | 600.0 | 04/12/19 14:38 | 669 Spruce St, Los Angeles, CA 90001 |
| 3 | 176560 | Wired Headphones | 1 | 11.99 | 04/12/19 14:38 | 669 Spruce St, Los Angeles, CA 90001 |
| 4 | 176561 | Wired Headphones | 1 | 11.99 | 04/30/19 09:27 | 333 8th St, Los Angeles, CA 90001 |


### Variable Dictionary

* Product - The product that was sold.
* Quantity Ordered - The total quantity of the item in the original order (before any changes).
* Price Each - The price of each product.
* Order Date - The date of the purchase.
* Purchase Address - May be either the billing address or the shipping address.

In [8]:
# Pass the number of rows you want to see
df.head(10)

|  | Order ID | Product | Quantity Ordered | Price Each | Order Date | Purchase Address |
|---|---|---|---|---|---|---|
| 0 | 176558 | USB-C Charging Cable | 2 | 11.95 | 04/19/19 08:46 | 917 1st St, Dallas, TX 75001 |
| 1 | 176559 | Bose SoundSport Headphones | 1 | 99.99 | 04/07/19 22:30 | 682 Chestnut St, Boston, MA 02215 |
| 2 | 176560 | Google Phone | 1 | 600.0 | 04/12/19 14:38 | 669 Spruce St, Los Angeles, CA 90001 |
| 3 | 176560 | Wired Headphones | 1 | 11.99 | 04/12/19 14:38 | 669 Spruce St, Los Angeles, CA 90001 |
| 4 | 176561 | Wired Headphones | 1 | 11.99 | 04/30/19 09:27 | 333 8th St, Los Angeles, CA 90001 |
| 5 | 176562 | USB-C Charging Cable | 1 | 11.95 | 04/29/19 13:03 | 381 Wilson St, San Francisco, CA 94016 |
| 6 | 176563 | Bose SoundSport Headphones | 1 | 99.99 | 04/02/19 07:46 | 668 Center St, Seattle, WA 98101 |
| 7 | 176564 | USB-C Charging Cable | 1 | 11.95 | 04/12/19 10:58 | 790 Ridge St, Atlanta, GA 30301 |
| 8 | 176565 | Macbook Pro Laptop | 1 | 1700.0 | 04/24/19 10:38 | 915 Willow St, San Francisco, CA 94016 |
| 9 | 176566 | Wired Headphones | 1 | 11.99 | 04/08/19 14:05 | 83 7th St, Boston, MA 02215 |


In [9]:
# View the last 5 rows
df.tail()

|  | Order ID | Product | Quantity Ordered | Price Each | Order Date | Purchase Address |
|---|---|---|---|---|---|---|
| 185945 | 259353 | AAA Batteries (4-pack) | 3 | 2.99 | 09/17/19 20:56 | 840 Highland St, Los Angeles, CA 90001 |
| 185946 | 259354 | iPhone | 1 | 700.0 | 09/01/19 16:00 | 216 Dogwood St, San Francisco, CA 94016 |
| 185947 | 259355 | iPhone | 1 | 700.0 | 09/23/19 07:39 | 220 12th St, San Francisco, CA 94016 |
| 185948 | 259356 | 34in Ultrawide Monitor | 1 | 379.99 | 09/19/19 17:30 | 511 Forest St, San Francisco, CA 94016 |
| 185949 | 259357 | USB-C Charging Cable | 1 | 11.95 | 09/30/19 00:18 | 250 Meadow St, San Francisco, CA 94016 |


In [10]:
# Pass the number of rows you want to see
df.tail(8)

|  | Order ID | Product | Quantity Ordered | Price Each | Order Date | Purchase Address |
|---|---|---|---|---|---|---|
| 185942 | 259350 | USB-C Charging Cable | 1 | 11.95 | 09/30/19 13:49 | 519 Maple St, San Francisco, CA 94016 |
| 185943 | 259351 | Apple Airpods Headphones | 1 | 150.0 | 09/01/19 19:43 | 981 4th St, New York City, NY 10001 |
| 185944 | 259352 | USB-C Charging Cable | 1 | 11.95 | 09/07/19 15:49 | 976 Forest St, San Francisco, CA 94016 |
| 185945 | 259353 | AAA Batteries (4-pack) | 3 | 2.99 | 09/17/19 20:56 | 840 Highland St, Los Angeles, CA 90001 |
| 185946 | 259354 | iPhone | 1 | 700.0 | 09/01/19 16:00 | 216 Dogwood St, San Francisco, CA 94016 |
| 185947 | 259355 | iPhone | 1 | 700.0 | 09/23/19 07:39 | 220 12th St, San Francisco, CA 94016 |
| 185948 | 259356 | 34in Ultrawide Monitor | 1 | 379.99 | 09/19/19 17:30 | 511 Forest St, San Francisco, CA 94016 |
| 185949 | 259357 | USB-C Charging Cable | 1 | 11.95 | 09/30/19 00:18 | 250 Meadow St, San Francisco, CA 94016 |


In [11]:
# View the total number of rows and columns
df.shape

(185950, 6)

In [12]:
# Number of rows
df.shape[0]

185950

In [13]:
# Number of columns
df.shape[1]

6
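
Since `shape` is a plain `(rows, columns)` tuple, it can also be unpacked in one step. The sketch below uses a small stand-in frame (`df_demo`, a name chosen here) so it runs on its own.

```python
import pandas as pd

df_demo = pd.DataFrame({"coluna1": [1, 2, 3, 4, 5], "coluna2": [6, 7, 8, 9, 10]})

n_rows, n_cols = df_demo.shape   # unpack the (rows, columns) tuple
print(n_rows, n_cols)            # 5 2
print(len(df_demo))              # 5, the same value as shape[0]
print(df_demo.size)              # 10, rows * columns
```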

In [14]:
# Return only the column names
df.columns

Index(['Order ID', 'Product', 'Quantity Ordered', 'Price Each', 'Order Date',
       'Purchase Address'],
      dtype='object')

In [16]:
# Data type of each column
df.dtypes

Order ID              int64
Product              object
Quantity Ordered      int64
Price Each          float64
Order Date           object
Purchase Address     object
dtype: object

In [17]:
# Another option: info() also reports non-null counts and memory usage
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 185950 entries, 0 to 185949
Data columns (total 6 columns):
 #   Column            Non-Null Count   Dtype  
---  ------            --------------   -----  
 0   Order ID          185950 non-null  int64  
 1   Product           185950 non-null  object 
 2   Quantity Ordered  185950 non-null  int64  
 3   Price Each        185950 non-null  float64
 4   Order Date        185950 non-null  object 
 5   Purchase Address  185950 non-null  object 
dtypes: float64(1), int64(2), object(3)
memory usage: 8.5+ MB
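
The Non-Null Count column above shows no gaps in this dataset. When a dataset does have gaps, one common way to count them per column is `isna().sum()`. The frame below is made up for illustration, with one missing value placed in each column.

```python
import pandas as pd

# Hypothetical data with deliberate gaps, invented for this illustration.
df_demo = pd.DataFrame({
    "Product": ["iPhone", None, "20in Monitor"],
    "Price Each": [700.0, 109.99, None],
})

# isna() marks missing cells as True; sum() counts them per column.
missing_per_column = df_demo.isna().sum()
print(missing_per_column)
```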


In [19]:
# View a single column
df["Product"].head(20)

0           USB-C Charging Cable
1     Bose SoundSport Headphones
2                   Google Phone
3               Wired Headphones
4               Wired Headphones
5           USB-C Charging Cable
6     Bose SoundSport Headphones
7           USB-C Charging Cable
8             Macbook Pro Laptop
9               Wired Headphones
10                  Google Phone
11      Lightning Charging Cable
12        27in 4K Gaming Monitor
13         AA Batteries (4-pack)
14      Lightning Charging Cable
15      Apple Airpods Headphones
16          USB-C Charging Cable
17                  Google Phone
18          USB-C Charging Cable
19        AAA Batteries (4-pack)
Name: Product, dtype: object

In [20]:
# View the unique values of the Product column
df["Product"].unique()

array(['USB-C Charging Cable', 'Bose SoundSport Headphones',
       'Google Phone', 'Wired Headphones', 'Macbook Pro Laptop',
       'Lightning Charging Cable', '27in 4K Gaming Monitor',
       'AA Batteries (4-pack)', 'Apple Airpods Headphones',
       'AAA Batteries (4-pack)', 'iPhone', 'Flatscreen TV',
       '27in FHD Monitor', '20in Monitor', 'LG Dryer', 'ThinkPad Laptop',
       'Vareebadd Phone', 'LG Washing Machine', '34in Ultrawide Monitor'],
      dtype=object)

In [21]:
# Count the unique values of the Product column
df["Product"].nunique()

19
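
A related call is `value_counts()`, which reports how many rows each unique value appears in. The sketch below uses a short, made-up list of product names rather than the full dataset. Note that it counts rows, not the quantities ordered.

```python
import pandas as pd

# Hypothetical short sample of product names, invented for this illustration.
products = pd.Series(
    ["USB-C Charging Cable", "Wired Headphones", "USB-C Charging Cable", "iPhone"],
    name="Product",
)

counts = products.value_counts()   # rows per unique value, most frequent first
print(counts)
print(products.nunique())          # number of distinct values
```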

In [23]:
# Select a random sample of the data
df.sample(10)

|  | Order ID | Product | Quantity Ordered | Price Each | Order Date | Purchase Address |
|---|---|---|---|---|---|---|
| 116952 | 173714 | AA Batteries (4-pack) | 1 | 3.84 | 03/21/19 18:16 | 357 Pine St, San Francisco, CA 94016 |
| 64849 | 159725 | Lightning Charging Cable | 1 | 14.95 | 02/19/19 10:15 | 476 South St, Dallas, TX 75001 |
| 70065 | 143944 | Lightning Charging Cable | 1 | 14.95 | 01/06/19 18:05 | 958 2nd St, San Francisco, CA 94016 |
| 152319 | 294017 | USB-C Charging Cable | 2 | 11.95 | 11/18/19 21:08 | 969 Jackson St, Los Angeles, CA 90001 |
| 169733 | 274400 | Lightning Charging Cable | 1 | 14.95 | 10/28/19 17:58 | 860 South St, New York City, NY 10001 |
| 165920 | 270749 | Flatscreen TV | 1 | 300.0 | 10/21/19 17:58 | 834 1st St, Dallas, TX 75001 |
| 140045 | 282254 | AA Batteries (4-pack) | 2 | 3.84 | 11/23/19 10:09 | 583 Ridge St, Austin, TX 73301 |
| 165926 | 270755 | AA Batteries (4-pack) | 1 | 3.84 | 10/24/19 18:45 | 305 North St, Seattle, WA 98101 |
| 48748 | 313424 | AA Batteries (4-pack) | 1 | 3.84 | 12/04/19 15:29 | 859 Main St, New York City, NY 10001 |
| 171524 | 276114 | Bose SoundSport Headphones | 1 | 99.99 | 10/21/19 13:49 | 570 Church St, Atlanta, GA 30301 |
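
`sample()` draws different rows on every run. If a repeatable sample is needed, the `random_state` parameter can be fixed, as in this small sketch on a made-up frame (`df_demo` and its column `x` are stand-in names).

```python
import pandas as pd

df_demo = pd.DataFrame({"x": range(100)})

# The same seed gives the same rows each time.
first = df_demo.sample(5, random_state=42)
second = df_demo.sample(5, random_state=42)
print(first.equals(second))   # True

# frac draws a fraction of the rows instead of a fixed count.
fraction = df_demo.sample(frac=0.1, random_state=0)
print(len(fraction))          # 10
```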


In [25]:
# Total quantity ordered per product, from highest to lowest
df.groupby("Product")["Quantity Ordered"].sum().sort_values(ascending=False)

Product
AAA Batteries (4-pack)        31017
AA Batteries (4-pack)         27635
USB-C Charging Cable          23975
Lightning Charging Cable      23217
Wired Headphones              20557
Apple Airpods Headphones      15661
Bose SoundSport Headphones    13457
27in FHD Monitor               7550
iPhone                         6849
27in 4K Gaming Monitor         6244
34in Ultrawide Monitor         6199
Google Phone                   5532
Flatscreen TV                  4819
Macbook Pro Laptop             4728
ThinkPad Laptop                4130
20in Monitor                   4129
Vareebadd Phone                2068
LG Washing Machine              666
LG Dryer                        646
Name: Quantity Ordered, dtype: int64
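
The same grouping pattern can compute several summaries at once with named aggregation. The sketch below is only an illustration on made-up rows: the `Revenue` column and the output names `units` and `revenue` are assumptions introduced here and do not exist in the original dataset.

```python
import pandas as pd

# Hypothetical rows that reuse the dataset's column names; values are invented.
sales = pd.DataFrame({
    "Product": ["iPhone", "Wired Headphones", "iPhone", "Wired Headphones"],
    "Quantity Ordered": [1, 2, 1, 1],
    "Price Each": [700.0, 11.99, 700.0, 11.99],
})

# Assumed derived column: revenue per row = quantity * unit price.
sales["Revenue"] = sales["Quantity Ordered"] * sales["Price Each"]

# Named aggregation: each keyword becomes an output column.
summary = (
    sales.groupby("Product")
    .agg(units=("Quantity Ordered", "sum"), revenue=("Revenue", "sum"))
    .sort_values("revenue", ascending=False)
)
print(summary)
```

Sorting by `revenue` rather than `units` can change the ranking, since inexpensive items tend to sell in larger quantities.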

In [26]:
# Convert Order Date from text to datetime
df["Order Date"] = pd.to_datetime(df["Order Date"])

In [27]:
df.dtypes

Order ID                     int64
Product                     object
Quantity Ordered             int64
Price Each                 float64
Order Date          datetime64[ns]
Purchase Address            object
dtype: object
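
Once a column is a datetime, its parts are available through the `.dt` accessor. The sketch below parses three date strings copied from the outputs above. The explicit `format` string is an assumption based on how those values look (month/day/two-digit year, 24-hour time), and it would raise an error on rows that do not match it.

```python
import pandas as pd

# Three date strings taken from the outputs shown earlier.
dates = pd.Series(
    ["04/19/19 08:46", "04/07/19 22:30", "09/30/19 00:18"],
    name="Order Date",
)

# Assumed format: month/day/two-digit year, then hour:minute.
parsed = pd.to_datetime(dates, format="%m/%d/%y %H:%M")

months = parsed.dt.month   # month number, 1 to 12
hours = parsed.dt.hour     # hour of the day, 0 to 23
print(months.tolist())     # [4, 4, 9]
print(hours.tolist())      # [8, 22, 0]
```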