# Adidas US Sales Datasets

An Adidas sales dataset records the sales activity of Adidas products. It typically includes the number of units sold, the revenue generated, the geographic distribution of sales, and the product types sold.

Adidas sales data can be used to analyze sales patterns, identify successful products or marketing campaigns, and plan future sales strategy. It also supports comparisons with competitors and evaluations of how well different marketing or sales channels perform.

An Adidas sales dataset could come from Adidas itself, market research companies, government bodies, or other organizations that track sales data. The fields it contains vary with the source and the intended use of the data.

<a id="cont"></a>

## Table of Contents

- [1. Import Packages](#one)
- [2. Load Data](#two)

<a id="one"></a>
# 1. Import Packages
[Back to Table of Contents](#cont)

---

In [82]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

<a id="two"></a>
# 2. Load Data
[Back to Table of Contents](#cont)

---

Convert the Excel sheet into a CSV file so that it can be reloaded with `pd.read_csv`.

In [83]:
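# Source workbook
excel_file = 'Adidas US Sales Datasets.xlsx'

column_names = [
    "Retailer", "Retailer ID", "Invoice Date", "Region", "State", "City",
    "Product", "Price per Unit", "Units Sold", "Total Sales",
    "Operating Profit", "Operating Margin", "Sales Method",
]
# header=None treats every sheet row as data, so the sheet's title and
# header rows are loaded as ordinary rows under the names given here.
df = pd.read_excel(excel_file, header=None, names=column_names)

# Drop rows that are completely empty
df = df.dropna(how='all')

# Write the result to CSV without the DataFrame index
csv_file = 'adidas_us_sales.csv'
df.to_csv(csv_file, index=False)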



In [84]:
adidas_sales_df = pd.read_csv("adidas_us_sales.csv")

In [85]:
adidas_sales_df.head()

| | Retailer | Retailer ID | Invoice Date | Region | State | City | Product | Price per Unit | Units Sold | Total Sales | Operating Profit | Operating Margin | Sales Method |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NaN | Adidas Sales Database | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | Retailer | Retailer ID | Invoice Date | Region | State | City | Product | Price per Unit | Units Sold | Total Sales | Operating Profit | Operating Margin | Sales Method |
| 2 | Foot Locker | 1185732 | 2020-01-01 00:00:00 | Northeast | New York | New York | Men's Street Footwear | 50 | 1200 | 600000 | 300000 | 0.5 | In-store |
| 3 | Foot Locker | 1185732 | 2020-01-02 00:00:00 | Northeast | New York | New York | Men's Athletic Footwear | 50 | 1000 | 500000 | 150000 | 0.3 | In-store |
| 4 | Foot Locker | 1185732 | 2020-01-03 00:00:00 | Northeast | New York | New York | Women's Street Footwear | 40 | 1000 | 400000 | 140000 | 0.35 | In-store |
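
Rows 0 and 1 in the output above are the sheet title and the original header row, read in as data because the workbook was loaded with `header=None`. One possible way to avoid this is to point pandas at the real header row. The sketch below illustrates the idea on a small in-memory CSV rather than the actual workbook: the sample text, the name `sample_df`, and the offset `header=1` are assumptions for illustration, and the offset in the real file would need to be checked. `pd.read_excel` accepts the same `header` parameter.

```python
import io

import pandas as pd

# Hypothetical miniature of the raw sheet: a title line, the real header, then data.
raw = io.StringIO(
    "Adidas Sales Database,,\n"
    "Retailer,Units Sold,Total Sales\n"
    "Foot Locker,64,3200\n"
    "Foot Locker,105,4305\n"
)

# header=1 says the second line (0-based index 1) holds the column names;
# the line before it is skipped instead of being loaded as data.
sample_df = pd.read_csv(raw, header=1)

print(sample_df.dtypes)
```

Because no text rows are mixed into the data in this sketch, pandas can infer numeric dtypes for the numeric columns instead of falling back to `object`.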


In [86]:
adidas_sales_df.tail()

| | Retailer | Retailer ID | Invoice Date | Region | State | City | Product | Price per Unit | Units Sold | Total Sales | Operating Profit | Operating Margin | Sales Method |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9645 | Foot Locker | 1185732 | 2021-01-24 00:00:00 | Northeast | New Hampshire | Manchester | Men's Apparel | 50 | 64 | 3200 | 896.0000000000001 | 0.28 | Outlet |
| 9646 | Foot Locker | 1185732 | 2021-01-24 00:00:00 | Northeast | New Hampshire | Manchester | Women's Apparel | 41 | 105 | 4305 | 1377.6 | 0.32 | Outlet |
| 9647 | Foot Locker | 1185732 | 2021-02-22 00:00:00 | Northeast | New Hampshire | Manchester | Men's Street Footwear | 41 | 184 | 7544 | 2791.28 | 0.37 | Outlet |
| 9648 | Foot Locker | 1185732 | 2021-02-22 00:00:00 | Northeast | New Hampshire | Manchester | Men's Athletic Footwear | 42 | 70 | 2940 | 1234.8000000000002 | 0.42 | Outlet |
| 9649 | Foot Locker | 1185732 | 2021-02-22 00:00:00 | Northeast | New Hampshire | Manchester | Women's Street Footwear | 29 | 83 | 2407 | 649.89 | 0.27 | Outlet |


In [87]:
adidas_sales_df.shape

(9650, 13)

In [88]:
adidas_sales_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9650 entries, 0 to 9649
Data columns (total 13 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   Retailer          9649 non-null   object
 1   Retailer ID       9650 non-null   object
 2   Invoice Date      9649 non-null   object
 3   Region            9649 non-null   object
 4   State             9649 non-null   object
 5   City              9649 non-null   object
 6   Product           9649 non-null   object
 7   Price per Unit    9649 non-null   object
 8   Units Sold        9649 non-null   object
 9   Total Sales       9649 non-null   object
 10  Operating Profit  9649 non-null   object
 11  Operating Margin  9649 non-null   object
 12  Sales Method      9649 non-null   object
dtypes: object(13)
memory usage: 980.2+ KB


In [89]:
adidas_sales_df.isnull().sum()

Retailer            1
Retailer ID         0
Invoice Date        1
Region              1
State               1
City                1
Product             1
Price per Unit      1
Units Sold          1
Total Sales         1
Operating Profit    1
Operating Margin    1
Sales Method        1
dtype: int64

In [90]:
null_values = adidas_sales_df[adidas_sales_df['Retailer'].isnull()]

In [91]:
null_values

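| | Retailer | Retailer ID | Invoice Date | Region | State | City | Product | Price per Unit | Units Sold | Total Sales | Operating Profit | Operating Margin | Sales Method |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NaN | Adidas Sales Database | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |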


In [92]:
df_cleaned = adidas_sales_df.dropna()
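
The two `dropna` calls in this notebook behave differently. The earlier `dropna(how='all')` removes only rows in which every cell is missing, so the title row survived it because its `Retailer ID` cell holds text. The default `dropna()` used here removes any row with at least one missing cell. A minimal sketch of the difference, using a made-up three-row frame (`demo` and its values are illustrative, not taken from the workbook):

```python
import numpy as np
import pandas as pd

# Row 0: one filled cell (like the title row); row 1: fully empty; row 2: complete.
demo = pd.DataFrame(
    {
        "Retailer": [np.nan, np.nan, "Foot Locker"],
        "Retailer ID": ["Adidas Sales Database", np.nan, "1185732"],
    }
)

# how="all" drops only the fully empty row, so the title-like row remains.
only_empty_rows_removed = demo.dropna(how="all")

# The default how="any" also drops the partially filled title-like row.
any_missing_removed = demo.dropna()

print(len(only_empty_rows_removed))  # 2
print(len(any_missing_removed))      # 1
```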

In [93]:
df_cleaned.isnull().sum()

Retailer            0
Retailer ID         0
Invoice Date        0
Region              0
State               0
City                0
Product             0
Price per Unit      0
Units Sold          0
Total Sales         0
Operating Profit    0
Operating Margin    0
Sales Method        0
dtype: int64
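
Note that `dropna()` removes the title row but not the repeated header row (index 1 in the `head()` output), since none of its cells are missing, and every column is still of dtype `object`. A possible follow-up step, sketched here on a small stand-in for `df_cleaned`, is to filter out that row and convert the columns to proper types. The frame `sample`, the result name `typed`, and the choice of columns are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

# Hypothetical miniature of df_cleaned: the stray header row is still present
# and every value is stored as text.
sample = pd.DataFrame(
    {
        "Retailer": ["Retailer", "Foot Locker", "Foot Locker"],
        "Invoice Date": ["Invoice Date", "2021-01-24 00:00:00", "2021-02-22 00:00:00"],
        "Units Sold": ["Units Sold", "64", "184"],
        "Operating Margin": ["Operating Margin", "0.28", "0.37"],
    }
)

# Drop rows whose "Retailer" cell is just the column name repeated.
typed = sample[sample["Retailer"] != "Retailer"].reset_index(drop=True).copy()

# Convert the text columns to datetime and numeric dtypes.
typed["Invoice Date"] = pd.to_datetime(typed["Invoice Date"])
for col in ["Units Sold", "Operating Margin"]:
    typed[col] = pd.to_numeric(typed[col])

print(typed.dtypes)
```

With numeric and datetime dtypes in place, aggregations such as sums by region or by month work without further casting.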