# Analysis of Motorcycle Sales Data

## Data Transformation

In [102]:
import pandas as pd

In [103]:
sales = pd.read_csv("data/sales_data.csv")
sales.head()

Unnamed: 0,date,warehouse,client_type,product_line,quantity,unit_price,total,payment
0,1/6/2021,Central,Retail,Miscellaneous,8,16.85,134.83,Credit card
1,1/6/2021,North,Retail,Breaking system,9,19.29,173.61,Cash
2,1/6/2021,North,Retail,Suspension & traction,8,32.93,263.45,Credit card
3,1/6/2021,North,Wholesale,Frame & body,16,37.84,605.44,Transfer
4,1/6/2021,Central,Retail,Engine,2,60.48,120.96,Credit card


In [104]:
# Convert the 'date' column to a datetime object
sales['date'] = pd.to_datetime(sales['date'])

# Set the 'date' column as the index of the DataFrame
#sales = sales.set_index('date')


Parsing dates in DD/MM/YYYY format when dayfirst=False (the default) was specified. This may lead to inconsistently parsed dates! Specify a format to ensure consistent parsing.
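
The warning above indicates that at least some values in `date` are day-first (DD/MM/YYYY). Under the default month-first parsing, an ambiguous string such as `1/6/2021` may be read as 6 January rather than 1 June, while an unambiguous one such as `13/6/2021` is read as 13 June. Below is a minimal sketch of a stricter parse, assuming every value in the column follows `%d/%m/%Y`. The `raw` series is made up for illustration and is not taken from the dataset.

```python
import pandas as pd

# Hypothetical sample of day-first date strings (illustrative only)
raw = pd.Series(["1/6/2021", "13/6/2021", "28/8/2021"])

# An explicit format removes the day/month ambiguity and raises
# an error if a value does not match, instead of guessing
parsed = pd.to_datetime(raw, format="%d/%m/%Y")

print(parsed.dt.month_name().tolist())
```

If the column is indeed day-first, the equivalent call here would be `pd.to_datetime(sales['date'], format='%d/%m/%Y')`. Any output derived from `date` would then change accordingly.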



In [105]:
# Create new columns for the weekday and month
sales['Weekday'] = sales['date'].dt.day_name()
sales['Month'] = sales['date'].dt.month_name()

## Exploratory Data Analysis

### Daily Revenue

In [106]:
# Group the DataFrame by the 'date' column and sum the numeric columns
grouped_sales = sales.groupby('date').sum(numeric_only=True).reset_index()



In [107]:
grouped_sales

Unnamed: 0,date,quantity,unit_price,total
0,2021-01-06,153,414.99,5378.29
1,2021-01-07,157,422.23,4511.43
2,2021-01-08,179,390.35,4599.58
3,2021-02-06,48,231.18,1297.14
4,2021-02-07,100,213.79,4125.26
...,...,...,...,...
84,2021-11-07,143,276.02,3157.42
85,2021-11-08,69,248.75,1697.49
86,2021-12-06,101,320.99,3256.23
87,2021-12-07,99,296.88,3257.05
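
The `unit_price` column in the table above is a sum of per-row prices, which is not a meaningful daily figure. Below is a sketch of restricting the aggregation to the additive columns. It uses a small made-up frame (`toy`) with the same column names as `sales`.

```python
import pandas as pd

# Hypothetical rows mirroring the structure of `sales` (illustrative only)
toy = pd.DataFrame({
    "date": pd.to_datetime(["2021-06-01", "2021-06-01", "2021-06-02"]),
    "quantity": [8, 9, 2],
    "unit_price": [16.85, 19.29, 60.48],
    "total": [134.83, 173.61, 120.96],
})

# Sum only the columns that are meaningful to add up per day
daily = toy.groupby("date", as_index=False)[["quantity", "total"]].sum()

print(daily)
```

Selecting the columns before calling `sum()` also avoids relying on the `numeric_only` default.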


In [108]:
import plotly.express as px

# Plot the DataFrame
fig = px.line(grouped_sales, x='date', y='total', title='Daily Revenue')

# Show the plot
fig.show()