# Eniac

## Importing Data

Loading the four CSV files (orders, orderlines, products and brands) from GitHub, each into its own DataFrame.

In [29]:
import pandas as pd

# URLs for raw content of the CSV files on GitHub
orders_url = "https://raw.githubusercontent.com/MerleSt/Eniac/main/Data-Eniac/orders.csv"
orderlines_url = "https://raw.githubusercontent.com/MerleSt/Eniac/main/Data-Eniac/orderlines.csv"
products_url = "https://raw.githubusercontent.com/MerleSt/Eniac/main/Data-Eniac/products.csv"
brands_url = "https://raw.githubusercontent.com/MerleSt/Eniac/main/Data-Eniac/brands.csv"


# Loading dataframes directly from GitHub
orders = pd.read_csv(orders_url)
orderlines = pd.read_csv(orderlines_url)
products = pd.read_csv(products_url)
brands = pd.read_csv(brands_url)

- DataFrame **.describe()** gives basic numerical aggregations. It can be applied to a single column as well.
- DataFrame **.isna().any()** highlights which columns contain missing data
- DataFrame **.shape** gives the number of rows and columns
- DataFrame **.columns** gives the column names. Note that a list with new names can be passed to this attribute to rename the columns.
- DataFrame **.columnName.isna().sum()** is a quick way to check the number of missing values in a column
- DataFrame **.columnName.value_counts()** is a great way to summarise a categorical column. You can use it to discover how many orders are completed, cancelled, pending…
- DataFrame **.columnName.hist()** is an easy way to plot a histogram in a numerical column. Play with the bins argument to change the granularity of the graph.
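The exploration methods listed above can be tried on a small made-up table before running them on the real data. This is only a sketch: `toy_orders`, its column names and its values are assumptions for illustration and are not taken from the Eniac files. `.hist()` is left out because it needs matplotlib installed.

```python
import pandas as pd

# Toy data: column names and values are invented for illustration only.
toy_orders = pd.DataFrame({
    "state": ["Completed", "Cancelled", "Completed", "Pending", "Completed"],
    "total_paid": [44.99, None, 129.16, 84.98, 17.99],
})

summary = toy_orders.describe()                        # count, mean, std, min, quartiles, max
has_missing = toy_orders.isna().any()                  # one boolean per column
n_rows, n_cols = toy_orders.shape                      # (rows, columns)
missing_paid = toy_orders["total_paid"].isna().sum()   # missing values in one column
state_counts = toy_orders["state"].value_counts()      # frequency of each category

# Renaming columns by assigning a list to .columns (done on a copy here).
renamed = toy_orders.copy()
renamed.columns = ["order_state", "amount_paid"]

print(summary)
print(has_missing)
print(n_rows, n_cols, missing_paid)
print(state_counts)
print(list(renamed.columns))
```

Bracket access (`toy_orders["state"]`) is used instead of attribute access (`toy_orders.state`) because it also works for column names that contain spaces or clash with DataFrame methods.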

In [35]:
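# Select rows whose price string contains exactly two dots.
# A raw string avoids the invalid-escape warning that '\.' triggers.
result = products[products['price'].str.count(r'\.') == 2]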
result.tail(50)

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
17076,SPE0195,Speck CandyShell Case Fit Apple Watch 38mm Bla...,rigid housing with compact design and double l...,1.999.041,199.904,0,2434
17310,SYN0174,Synology DS1817 NAS server Mac and PC,NAS with 4GB of RAM and 8 bays compatible with...,9.789.904,9.789.904,0,12175397
17359,LOG0193-A,Open - Logitech QWERTY keyboard cover Spanish ...,Cover with backlit keyboard and smart plug for...,1.499.904,1.299.901,0,1298
17365,APP2133-A,"Like new - Apple iPad Pro 12.9 ""Wi-Fi + Cellul...",iPad Pro 12.9 inch Wi-Fi refurbished 512GB Silver,14.490.004,12.748.614,0,1298
17392,OWC0190-A,Open - Mac OWC Memory 16GB 1333MHZ DDR3 DIMM,16GB RAM for Mac Pro 2010/2012 Reconditioned,1.199.897,967.572,0,1364
17485,APP2493,Apple TV 32GB 4K,Apple multimedia player with 4K resolution and...,1.990.002,194,1,113464259
17486,APP2494,Apple TV 4K 64GB,Apple multimedia player with 4K resolution and...,21.900.032,2.190.003,1,113464259
17487,APP2490,Apple iPhone 64GB X Silver,New Apple iPhone 64GB Free Silver X,115.900.092,11.590.009,1,113271716
17489,APP2491,Apple iPhone X 256GB Space Gray,New Apple iPhone X 256GB Free Space Gray,13.290.011,13.290.011,1,113271716
17490,APP2492,Apple iPhone X 256GB Silver,New Apple iPhone X 256GB Silver Free,13.290.011,13.290.011,1,113271716


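Price strings with two dots cannot be cast to float directly. Removing every dot before casting works around this:

```products['price'] = products['price'].str.replace('.', '', regex=False).astype(float)```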
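To make the effect of that replacement concrete, here is a minimal sketch on three price strings. The first two are copied from the output above; `"59.99"` is a hypothetical single-dot value added for contrast. Note that the replacement strips every dot, including one that may be acting as a decimal separator, so the resulting numbers are on a different scale than the displayed prices.

```python
import pandas as pd

# Two values from the table above plus one hypothetical single-dot price.
prices = pd.Series(["1.999.041", "9.789.904", "59.99"])

# Flag strings with exactly two literal dots.
two_dots = prices.str.count(r"\.") == 2

# regex=False treats "." as a literal character, not as "any character".
cleaned = prices.str.replace(".", "", regex=False).astype(float)

print(two_dots.tolist())
print(cleaned.tolist())
```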

In [13]:
orderlines.dtypes

id                   int64
id_order             int64
product_id           int64
product_quantity     int64
sku                 object
unit_price          object
date                object
dtype: object

In [14]:
orderlines['date']  = pd.to_datetime(orderlines['date'])

In [15]:
orderlines['unit_price'] = orderlines['unit_price'].str.replace('.', '', regex=False).astype(float)

In [16]:
orderlines['id']= orderlines['id'].astype(str)

In [17]:
orderlines['id_order']= orderlines['id_order'].astype(str)

In [18]:
orderlines.drop('product_id', axis=1, inplace=True)

In [19]:
orderlines

Unnamed: 0,id,id_order,product_quantity,sku,unit_price,date
0,1119109,299539,1,OTT0133,1899.0,2017-01-01 00:07:19
1,1119110,299540,1,LGE0043,39900.0,2017-01-01 00:19:45
2,1119111,299541,1,PAR0071,47405.0,2017-01-01 00:20:57
3,1119112,299542,1,WDT0315,6839.0,2017-01-01 00:51:40
4,1119113,299543,1,JBL0104,2374.0,2017-01-01 01:06:38
...,...,...,...,...,...,...
293978,1650199,527398,1,JBL0122,4299.0,2018-03-14 13:57:25
293979,1650200,527399,1,PAC0653,14158.0,2018-03-14 13:57:34
293980,1650201,527400,2,APP0698,999.0,2018-03-14 13:57:41
293981,1650202,527388,1,BEZ0204,1999.0,2018-03-14 13:58:01


Change `id` and `id_order` to strings (object dtype). Drop `product_id` since it is no longer in use, change `unit_price` to float and `date` to datetime.
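The same orderlines cleaning steps can be written as one chained expression, which leaves the original DataFrame untouched. This is a sketch on a two-row toy table: the ids, SKUs and dates mirror the output above, while the raw `unit_price` strings and the `product_id` values are assumptions, since the raw file is not shown here.

```python
import pandas as pd

# Toy stand-in for the raw orderlines table (raw price strings are assumed).
toy_orderlines = pd.DataFrame({
    "id": [1119109, 1119110],
    "id_order": [299539, 299540],
    "product_id": [0, 0],
    "product_quantity": [1, 1],
    "sku": ["OTT0133", "LGE0043"],
    "unit_price": ["18.99", "399.00"],
    "date": ["2017-01-01 00:07:19", "2017-01-01 00:19:45"],
})

cleaned_orderlines = (
    toy_orderlines
    .drop(columns="product_id")                       # column no longer in use
    .assign(
        id=lambda df: df["id"].astype(str),           # identifiers as strings
        id_order=lambda df: df["id_order"].astype(str),
        unit_price=lambda df: (
            df["unit_price"].str.replace(".", "", regex=False).astype(float)
        ),
        date=lambda df: pd.to_datetime(df["date"]),   # strings to timestamps
    )
)

print(cleaned_orderlines.dtypes)
```

Chaining avoids `inplace=True` and makes it possible to rerun the cell without errors such as dropping a column that is already gone.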

In [20]:
products.dtypes

sku            object
name           object
desc           object
price          object
promo_price    object
in_stock        int64
type           object
dtype: object

Change `price` and `promo_price` to float and `in_stock` to boolean.
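A possible way to apply these conversions with a little more safety is sketched below. The helper `dots_removed_as_float` and the table `toy_products` are hypothetical, and the raw price strings are assumptions. The check before the boolean cast is an addition, not part of the original notebook: `astype(bool)` turns any non-zero value into `True`, so it can be worth confirming that the column holds only 0 and 1 first.

```python
import pandas as pd

# Toy stand-in for the raw products table (raw strings are assumed).
toy_products = pd.DataFrame({
    "sku": ["RAI0007", "APP0023"],
    "price": ["59.99", "59"],
    "promo_price": ["499.899", "589.996"],
    "in_stock": [1, 0],
})


def dots_removed_as_float(column: pd.Series) -> pd.Series:
    """Strip every literal dot from the strings, then cast to float."""
    return column.str.replace(".", "", regex=False).astype(float)


for col in ["price", "promo_price"]:
    toy_products[col] = dots_removed_as_float(toy_products[col])

# Guard against unexpected codes before casting to boolean.
if not toy_products["in_stock"].isin([0, 1]).all():
    raise ValueError("in_stock contains values other than 0 and 1")
toy_products["in_stock"] = toy_products["in_stock"].astype(bool)

print(toy_products.dtypes)
```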

In [21]:
products['price'] = products['price'].str.replace('.', '', regex=False).astype(float)

In [22]:
products['promo_price'] = products['promo_price'].str.replace('.', '', regex=False).astype(float)

In [23]:
products['in_stock'] = products['in_stock'].astype(bool)

In [24]:
brands.dtypes

short    object
long     object
dtype: object

In [25]:
brands.head()

Unnamed: 0,short,long
0,8MO,8Mobility
1,ACM,Acme
2,ADN,Adonit
3,AII,Aiino
4,AKI,Akitio


In [26]:
products.head()

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
0,RAI0007,Silver Rain Design mStand Support,Aluminum support compatible with all MacBook,5999.0,499899.0,True,8696
1,APP0023,Apple Mac Keyboard Keypad Spanish,USB ultrathin keyboard Apple Mac Spanish.,59.0,589996.0,False,13855401
2,APP0025,Mighty Mouse Apple Mouse for Mac,mouse Apple USB cable.,59.0,569898.0,False,1387
3,APP0072,Apple Dock to USB Cable iPhone and iPod white,IPhone dock and USB Cable Apple iPod.,25.0,229997.0,False,1230
4,KIN0007,Mac Memory Kingston 2GB 667MHz DDR2 SO-DIMM,2GB RAM Mac mini and iMac (2006/07) MacBook Pr...,3499.0,3199.0,True,1364


In [27]:
products.tail()

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
19321,BEL0376,Belkin Travel Support Apple Watch Black,compact and portable stand vertically or horiz...,2999.0,269903.0,True,12282
19322,THU0060,"Enroute Thule 14L Backpack MacBook 13 ""Black",Backpack with capacity of 14 liter compartment...,6995.0,649903.0,True,1392
19323,THU0061,"Enroute Thule 14L Backpack MacBook 13 ""Blue",Backpack with capacity of 14 liter compartment...,6995.0,649903.0,True,1392
19324,THU0062,"Enroute Thule 14L Backpack MacBook 13 ""Red",Backpack with capacity of 14 liter compartment...,6995.0,649903.0,False,1392
19325,THU0063,"Enroute Thule 14L Backpack MacBook 13 ""Green",Backpack with capacity of 14 liter compartment...,6995.0,649903.0,True,1392


In [28]:
orderlines

Unnamed: 0,id,id_order,product_quantity,sku,unit_price,date
0,1119109,299539,1,OTT0133,1899.0,2017-01-01 00:07:19
1,1119110,299540,1,LGE0043,39900.0,2017-01-01 00:19:45
2,1119111,299541,1,PAR0071,47405.0,2017-01-01 00:20:57
3,1119112,299542,1,WDT0315,6839.0,2017-01-01 00:51:40
4,1119113,299543,1,JBL0104,2374.0,2017-01-01 01:06:38
...,...,...,...,...,...,...
293978,1650199,527398,1,JBL0122,4299.0,2018-03-14 13:57:25
293979,1650200,527399,1,PAC0653,14158.0,2018-03-14 13:57:34
293980,1650201,527400,2,APP0698,999.0,2018-03-14 13:57:41
293981,1650202,527388,1,BEZ0204,1999.0,2018-03-14 13:58:01
