# Exploring eBay Car Sales Data

## Introduction

The dataset contains data about used cars from eBay Kleinanzeigen, the classifieds section of the German eBay website.

The original dataset can be found on [Kaggle](https://www.kaggle.com/orgesleka/used-cars-database/data). For practice purposes, it has been shortened by sampling. Moreover, because the original had already been cleaned, some dirt has been added back in.

The dataset used here was downloaded from the DataQuest guided project.

The dataset includes the following columns:
- **dateCrawled** - When this ad was first crawled. All field-values are taken from this date.
- **name** - Name of the car.
- **seller** - Whether the seller is private or a dealer.
- **offerType** - The type of listing.
- **price** - The price on the ad to sell the car.
- **abtest** - Whether the listing is included in an A/B test.
- **vehicleType** - The vehicle type.
- **yearOfRegistration** - The year in which the car was first registered.
- **gearbox** - The transmission type.
- **powerPS** - The power of the car in PS.
- **model** - The car model name.
- **odometer** - How many kilometers the car has driven.
- **monthOfRegistration** - The month in which the car was first registered.
- **fuelType** - What type of fuel the car uses.
- **brand** - The brand of the car.
- **notRepairedDamage** - Whether the car has damage that has not yet been repaired.
- **dateCreated** - The date on which the eBay listing was created.
- **nrOfPictures** - The number of pictures in the ad.
- **postalCode** - The postal code for the location of the vehicle.
- **lastSeen** - When the crawler last saw this ad online.

In [1]:
import pandas as pd
import numpy as np

autos = pd.read_csv("my_datasets/ebay_car_sales/autos.csv", encoding="Latin-1")

In [2]:
print(autos.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50000 entries, 0 to 49999
Data columns (total 20 columns):
dateCrawled            50000 non-null object
name                   50000 non-null object
seller                 50000 non-null object
offerType              50000 non-null object
price                  50000 non-null object
abtest                 50000 non-null object
vehicleType            44905 non-null object
yearOfRegistration     50000 non-null int64
gearbox                47320 non-null object
powerPS                50000 non-null int64
model                  47242 non-null object
odometer               50000 non-null object
monthOfRegistration    50000 non-null int64
fuelType               45518 non-null object
brand                  50000 non-null object
notRepairedDamage      40171 non-null object
dateCreated            50000 non-null object
nrOfPictures           50000 non-null int64
postalCode             50000 non-null int64
lastSeen               50000 non-null obj

In [3]:
print(autos.head())

           dateCrawled                                               name  \
0  2016-03-26 17:47:46                   Peugeot_807_160_NAVTECH_ON_BOARD   
1  2016-04-04 13:38:56         BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik   
2  2016-03-26 18:57:24                         Volkswagen_Golf_1.6_United   
3  2016-03-12 16:58:10  Smart_smart_fortwo_coupe_softouch/F1/Klima/Pan...   
4  2016-04-01 14:38:50  Ford_Focus_1_6_Benzin_TÜV_neu_ist_sehr_gepfleg...   

   seller offerType   price   abtest vehicleType  yearOfRegistration  \
0  privat   Angebot  $5,000  control         bus                2004   
1  privat   Angebot  $8,500  control   limousine                1997   
2  privat   Angebot  $8,990     test   limousine                2009   
3  privat   Angebot  $4,350  control  kleinwagen                2007   
4  privat   Angebot  $1,350     test       kombi                2003   

     gearbox  powerPS   model   odometer  monthOfRegistration fuelType  \
0    manuell      158  andere 

**Considerations**:
- We have seen that "autos.csv" is encoded in "Latin-1".
- The dataset has 50000 rows and 20 columns (already described above).
- Some columns have null elements in specific rows.
- 5 columns out of 20 contain int64 elements.
- 15 columns out of 20 contain object (string) elements.
- The content is partially in German.
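The dtype and null counts listed above can also be derived programmatically instead of being read off `autos.info()`. The following is a minimal sketch that assumes a small hand-made DataFrame (`sample`, with illustrative values) in place of the real file:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `autos`; the values are illustrative only.
sample = pd.DataFrame({
    "name": ["car_a", "car_b", "car_c"],
    "vehicleType": ["bus", np.nan, "kombi"],
    "powerPS": [158, 286, 0],
})

# Number of columns per dtype.
dtype_counts = sample.dtypes.astype(str).value_counts()

# Number of missing values per column, keeping only columns that have any.
null_counts = sample.isnull().sum()
null_counts = null_counts[null_counts > 0]

print(dtype_counts)
print(null_counts)
```

Applied to `autos`, the same two expressions should reproduce the counts summarised in the list.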

## Converting column names from camelCase to snake_case

To work with snake_case instead of camelCase column names, we need to rename the columns:

In [4]:
autos.columns

Index(['dateCrawled', 'name', 'seller', 'offerType', 'price', 'abtest',
       'vehicleType', 'yearOfRegistration', 'gearbox', 'powerPS', 'model',
       'odometer', 'monthOfRegistration', 'fuelType', 'brand',
       'notRepairedDamage', 'dateCreated', 'nrOfPictures', 'postalCode',
       'lastSeen'],
      dtype='object')

In [5]:
new_col = ['date_crawled', 'name', 'seller', 'offer_type', 'price', 'abtest',
       'vehicle_type', 'registration_year', 'gearbox', 'power_ps', 'model',
       'odometer', 'registration_month', 'fuel_type', 'brand',
       'unrepaired_damage', 'ad_created', 'nr_of_pictures', 'postal_code',
       'last_seen']

# Modify the column names
autos.columns = new_col

# Check the result
autos.head()

Unnamed: 0,date_crawled,name,seller,offer_type,price,abtest,vehicle_type,registration_year,gearbox,power_ps,model,odometer,registration_month,fuel_type,brand,unrepaired_damage,ad_created,nr_of_pictures,postal_code,last_seen
0,2016-03-26 17:47:46,Peugeot_807_160_NAVTECH_ON_BOARD,privat,Angebot,"$5,000",control,bus,2004,manuell,158,andere,"150,000km",3,lpg,peugeot,nein,2016-03-26 00:00:00,0,79588,2016-04-06 06:45:54
1,2016-04-04 13:38:56,BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik,privat,Angebot,"$8,500",control,limousine,1997,automatik,286,7er,"150,000km",6,benzin,bmw,nein,2016-04-04 00:00:00,0,71034,2016-04-06 14:45:08
2,2016-03-26 18:57:24,Volkswagen_Golf_1.6_United,privat,Angebot,"$8,990",test,limousine,2009,manuell,102,golf,"70,000km",7,benzin,volkswagen,nein,2016-03-26 00:00:00,0,35394,2016-04-06 20:15:37
3,2016-03-12 16:58:10,Smart_smart_fortwo_coupe_softouch/F1/Klima/Pan...,privat,Angebot,"$4,350",control,kleinwagen,2007,automatik,71,fortwo,"70,000km",6,benzin,smart,nein,2016-03-12 00:00:00,0,33729,2016-03-15 03:16:28
4,2016-04-01 14:38:50,Ford_Focus_1_6_Benzin_TÜV_neu_ist_sehr_gepfleg...,privat,Angebot,"$1,350",test,kombi,2003,manuell,0,focus,"150,000km",7,benzin,ford,nein,2016-04-01 00:00:00,0,39218,2016-04-01 14:38:50


As you can see, the column names have been converted to snake_case, which is the preferred naming convention in Python, rather than camelCase.

For that purpose, we:
- first checked the column names with "autos.columns".
- then copied the list and assigned it to a new variable.
- made the needed modifications.
- finally, assigned the column names back using the new variable.
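The manual steps above could be partly automated. The sketch below uses a hypothetical helper, `camel_to_snake`, which is not part of the original notebook and only handles the mechanical part of the conversion:

```python
import re

def camel_to_snake(name):
    """Convert a camelCase identifier to snake_case."""
    # Insert "_" at every boundary between a lowercase letter or digit
    # and an uppercase letter, then lowercase the whole string.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()

original = ["dateCrawled", "offerType", "powerPS", "nrOfPictures", "abtest"]
converted = [camel_to_snake(col) for col in original]
print(converted)
```

Names that were reworded rather than just re-cased (for example "yearOfRegistration" to "registration_year", or "dateCreated" to "ad_created") would still need an explicit mapping.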

## Data converting and cleaning

Now it's time to check the content of the dataset, looking for columns that need changes such as:
- Columns where all or almost all values are the same --> drop the column
- Numeric data stored as strings --> clean and convert to numbers

In [6]:
autos.describe(include="all")

Unnamed: 0,date_crawled,name,seller,offer_type,price,abtest,vehicle_type,registration_year,gearbox,power_ps,model,odometer,registration_month,fuel_type,brand,unrepaired_damage,ad_created,nr_of_pictures,postal_code,last_seen
count,50000,50000,50000,50000,50000,50000,44905,50000.0,47320,50000.0,47242,50000,50000.0,45518,50000,40171,50000,50000.0,50000.0,50000
unique,48213,38754,2,2,2357,2,8,,2,,245,13,,7,40,2,76,,,39481
top,2016-04-02 11:37:04,Ford_Fiesta,privat,Angebot,$0,test,limousine,,manuell,,golf,"150,000km",,benzin,volkswagen,nein,2016-04-03 00:00:00,,,2016-04-07 06:17:27
freq,3,78,49999,49999,1421,25756,12859,,36993,,4024,32424,,30107,10687,35232,1946,,,8
mean,,,,,,,,2005.07328,,116.35592,,,5.72336,,,,,0.0,50813.6273,
std,,,,,,,,105.712813,,209.216627,,,3.711984,,,,,0.0,25779.747957,
min,,,,,,,,1000.0,,0.0,,,0.0,,,,,0.0,1067.0,
25%,,,,,,,,1999.0,,70.0,,,3.0,,,,,0.0,30451.0,
50%,,,,,,,,2003.0,,105.0,,,6.0,,,,,0.0,49577.0,
75%,,,,,,,,2008.0,,150.0,,,9.0,,,,,0.0,71540.0,


After looking at the statistics, we can see that:
- some columns contain dates stored as strings. They could be converted to datetime.
- the "seller" column has 2 different values, and in 49999 of the 50000 rows it is "privat".
- the "offer_type" column has 2 different values, and in 49999 of the 50000 rows it is "Angebot".
- the "price" column is stored as a string because it contains the "$" symbol and thousands separators.
- "abtest" has only two values and is stored as a string.
- there are missing values in "vehicle_type", "gearbox", "model", "fuel_type" and "unrepaired_damage". None of these columns is missing more than 20% of its values.
- "registration_year" is already stored as a number. It could be converted to datetime.
- there are 7 values of "fuel_type". Perhaps one type appears under more than one name.
- there are 40 brands in "brand". Perhaps one brand appears under more than one name.
- the "odometer" column is stored as a string. It can be converted to a number of km.
- "registration_month" is stored as an integer. It could be merged with "registration_year" into a datetime object.
- "nr_of_pictures" is always 0.
- "postal_code" is stored as an integer. That is acceptable, although no numeric operations apply to it.
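One of the points above suggests merging "registration_month" and "registration_year" into a single datetime. The statistics show a minimum month of 0, which is not a calendar month, so a merge has to deal with that. A minimal sketch on a hypothetical `sample` frame, assuming the first day of the month is an acceptable placeholder:

```python
import pandas as pd

# Hypothetical stand-in for the two registration columns.
sample = pd.DataFrame({
    "registration_year": [2004, 1997, 2009],
    "registration_month": [3, 0, 12],
})

# Month 0 is not a calendar month, so treat it as missing.
valid_month = sample["registration_month"].between(1, 12)

# Build "YYYY-MM-01" strings, blanking out rows with an invalid month.
date_str = (
    sample["registration_year"].astype(str)
    + "-"
    + sample["registration_month"].astype(str).str.zfill(2)
    + "-01"
).where(valid_month)

# errors="coerce" turns missing or unparseable entries into NaT.
sample["registration_date"] = pd.to_datetime(
    date_str, format="%Y-%m-%d", errors="coerce"
)
print(sample)
```

Out-of-range years would also end up as `NaT` with `errors="coerce"`, so cleaning "registration_year" first keeps the intent clearer.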

### Removing columns

After the review, 3 columns can be removed:
- "seller": 49999/50000 with "privat" value
- "offer_type": 49999/50000 with "Angebot" value
- "nr_of_pictures": always 0

In [7]:
autos = autos.drop(["seller","offer_type","nr_of_pictures"], axis=1)
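Candidates for removal were found here by reading the `describe` output. As a complement, a small helper can flag columns dominated by a single value. The function name `near_constant_columns`, the 0.99 threshold and the sample values below are assumptions made for illustration:

```python
import pandas as pd

def near_constant_columns(df, threshold=0.99):
    """Return columns whose most frequent value covers at least `threshold` of rows."""
    flagged = []
    for col in df.columns:
        # Share of the most common value, counting NaN as a value of its own.
        top_share = df[col].value_counts(normalize=True, dropna=False).iloc[0]
        if top_share >= threshold:
            flagged.append(col)
    return flagged

# Hypothetical sample: two near-constant columns and one balanced column.
sample = pd.DataFrame({
    "seller": ["privat"] * 199 + ["gewerblich"],
    "nr_of_pictures": [0] * 200,
    "brand": ["bmw", "audi"] * 100,
})
print(near_constant_columns(sample))
```

The threshold is a judgment call: a value of 0.99 would flag "seller", "offer_type" and "nr_of_pictures" in this dataset, based on the frequencies shown in the statistics.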

### Converting string to numbers

Let's clean and convert "price" and "odometer" columns to numbers:

In [8]:
autos["price"].head()

0    $5,000
1    $8,500
2    $8,990
3    $4,350
4    $1,350
Name: price, dtype: object

In [9]:
autos["odometer"].head()

0    150,000km
1    150,000km
2     70,000km
3     70,000km
4    150,000km
Name: odometer, dtype: object

In the case of "price", the dollar symbol and the thousands separator must be removed.

In the case of "odometer", the "km" suffix must be removed. Moreover, the column can be renamed to "odometer_km" to keep the unit information that is removed from the values.

In [10]:
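# regex=False treats "$" as a literal character, not as the regex end-of-string anchor
autos["price"] = autos["price"].str.replace("$","",regex=False).str.replace(",","",regex=False).astype(int)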
autos["price"].head()

0    5000
1    8500
2    8990
3    4350
4    1350
Name: price, dtype: int32

In [11]:
autos["odometer"] = autos["odometer"].str.replace("km","").str.replace(",","").astype(int)
autos.rename(columns={"odometer": "odometer_km"},inplace=True)
autos["odometer_km"].head()

0    150000
1    150000
2     70000
3     70000
4    150000
Name: odometer_km, dtype: int32
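The cleaning pattern used for both columns can be tried in isolation on a couple of literal values. This sketch uses made-up Series rather than the real columns and also shows an alternative that strips every non-digit with one regular expression:

```python
import pandas as pd

prices = pd.Series(["$5,000", "$8,500", "$0"])

# regex=False makes "$" a literal character instead of a regex anchor.
clean_prices = (
    prices.str.replace("$", "", regex=False)
          .str.replace(",", "", regex=False)
          .astype(int)
)

odometer = pd.Series(["150,000km", "70,000km"])

# Alternative: one character class removes everything that is not a digit.
clean_km = odometer.str.replace(r"[^0-9]", "", regex=True).astype(int)

print(clean_prices.tolist())
print(clean_km.tolist())
```

The regex variant is shorter but would silently drop a minus sign or decimal point, so it only suits columns known to hold non-negative whole numbers.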

### Cleaning values from "price" and "odometer_km"

Let's now check unrealistic values in these 2 columns, starting from "price":

In [12]:
print("Number of unique price values:", autos["price"].unique().shape[0], "\n")
print("Price column statistics:\n")
print(autos["price"].describe(), "\n")
print("5 most common prices and times they appear:\n")
print(autos["price"].value_counts().head(), "\n")
print("20 highest prices and times they appear:\n")
print(autos["price"].value_counts().sort_index(ascending=False).head(20), "\n")
print("20 lowest prices and times they appear:\n")
print(autos["price"].value_counts().sort_index(ascending=True).head(20))

Number of unique price values: 2357 

Price column statistics:

count    5.000000e+04
mean     9.840044e+03
std      4.811044e+05
min      0.000000e+00
25%      1.100000e+03
50%      2.950000e+03
75%      7.200000e+03
max      1.000000e+08
Name: price, dtype: float64 

5 most common prices and times they appear:

0       1421
500      781
1500     734
2500     643
1000     639
Name: price, dtype: int64 

20 highest prices and times they appear:

99999999    1
27322222    1
12345678    3
11111111    2
10000000    1
3890000     1
1300000     1
1234566     1
999999      2
999990      1
350000      1
345000      1
299000      1
295000      1
265000      1
259000      1
250000      1
220000      1
198000      1
197000      1
Name: price, dtype: int64 

20 lowest prices and times they appear:

0     1421
1      156
2        3
3        1
5        2
8        1
9        1
10       7
11       2
12       3
13       2
14       1
15       2
17       3
18       1
20       4
25       5
29       1
30 

We can see that there are unrealistic prices, both too low and too high, so some rows can be removed.
- Looking at the high prices, an acceptable upper limit could be $350,000, which is plausible for an expensive car. Above that value the prices jump to 999990 and beyond.
- Looking at the low prices, there are 1421 cars with price 0. We can remove them, although eBay offers auctions, so a price of 0 might be realistic. However, 1421 out of 50000 is a small proportion (under 3%), so deleting them is acceptable.

In [13]:
autos = autos[autos["price"].between(1,350000)]

#Checking modifications:
print("Number of unique price values:", autos["price"].unique().shape[0], "\n")
print("Price column statistics:\n")
print(autos["price"].describe(), "\n")

Number of unique price values: 2346 

Price column statistics:

count     48565.000000
mean       5888.935591
std        9059.854754
min           1.000000
25%        1200.000000
50%        3000.000000
75%        7490.000000
max      350000.000000
Name: price, dtype: float64 
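The filter above takes the row count from 50000 to 48565. To see exactly which boundary values survive, here is a small sketch on made-up prices; `Series.between` is inclusive on both ends by default:

```python
import pandas as pd

prices = pd.Series([0, 1, 500, 350000, 350001, 99999999])

# Both bounds are inclusive, so 1 and 350000 are kept.
keep = prices.between(1, 350000)

kept = int(keep.sum())
removed = int((~keep).sum())
print("rows kept:", kept, "rows removed:", removed)
```

Counting the mask before applying it is a cheap way to confirm how many rows a filter will drop.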



Now, let's take a look at the "odometer_km" column:

In [14]:
print("Number of unique odometer values:", autos["odometer_km"].unique().shape[0], "\n")
print("Odometer column statistics:\n")
print(autos["odometer_km"].describe(), "\n")
print("5 most common odometer values and times they appear:\n")
print(autos["odometer_km"].value_counts().head(), "\n")
print("5 highest odometer values and times they appear:\n")
print(autos["odometer_km"].value_counts().sort_index(ascending=False).head(), "\n")
print("5 lowest odometer values and times they appear:\n")
print(autos["odometer_km"].value_counts().sort_index(ascending=True).head())

Number of unique odometer values: 13 

Odometer column statistics:

count     48565.000000
mean     125770.101925
std       39788.636804
min        5000.000000
25%      125000.000000
50%      150000.000000
75%      150000.000000
max      150000.000000
Name: odometer_km, dtype: float64 

5 most common odometer values and times they appear:

150000    31414
125000     5057
100000     2115
90000      1734
80000      1415
Name: odometer_km, dtype: int64 

5 highest odometer values and times they appear:

150000    31414
125000     5057
100000     2115
90000      1734
80000      1415
Name: odometer_km, dtype: int64 

5 lowest odometer values and times they appear:

5000     836
10000    253
20000    762
30000    780
40000    815
Name: odometer_km, dtype: int64


In this case, the values look discrete (only 13 distinct values), so they might be chosen by the seller from a predefined list. We don't see any outliers in this column.
Values range from 5000 km to 150000 km.

### Converting strings to dates and times

Now let's look at the columns with dates and times, in order to convert the strings into datetime objects. These columns are:
- **date_crawled**: added by the crawler (string)
- **last_seen**: added by the crawler (string)
- **ad_created**: from the website (string)
- **registration_month**: from the website (number)
- **registration_year**: from the website (number)

Let's check the distribution of the 3 columns with dates stored as strings:

In [15]:
#date_crawled
autos["date_crawled"].str[:10].value_counts(normalize=True, dropna=False).sort_index()

2016-03-05    0.025327
2016-03-06    0.014043
2016-03-07    0.036014
2016-03-08    0.033296
2016-03-09    0.033090
2016-03-10    0.032184
2016-03-11    0.032575
2016-03-12    0.036920
2016-03-13    0.015670
2016-03-14    0.036549
2016-03-15    0.034284
2016-03-16    0.029610
2016-03-17    0.031628
2016-03-18    0.012911
2016-03-19    0.034778
2016-03-20    0.037887
2016-03-21    0.037373
2016-03-22    0.032987
2016-03-23    0.032225
2016-03-24    0.029342
2016-03-25    0.031607
2016-03-26    0.032204
2016-03-27    0.031092
2016-03-28    0.034860
2016-03-29    0.034099
2016-03-30    0.033687
2016-03-31    0.031834
2016-04-01    0.033687
2016-04-02    0.035478
2016-04-03    0.038608
2016-04-04    0.036487
2016-04-05    0.013096
2016-04-06    0.003171
2016-04-07    0.001400
Name: date_crawled, dtype: float64

`Observations`:

The dates in the column span about one month, from the beginning of March to the beginning of April 2016. The dates seem evenly distributed, with no day standing out above the others. The last two days have a much smaller share.

In [16]:
#last_seen
autos["last_seen"].str[:10].value_counts(normalize=True, dropna=False).sort_index()

2016-03-05    0.001071
2016-03-06    0.004324
2016-03-07    0.005395
2016-03-08    0.007413
2016-03-09    0.009595
2016-03-10    0.010666
2016-03-11    0.012375
2016-03-12    0.023783
2016-03-13    0.008895
2016-03-14    0.012602
2016-03-15    0.015876
2016-03-16    0.016452
2016-03-17    0.028086
2016-03-18    0.007351
2016-03-19    0.015834
2016-03-20    0.020653
2016-03-21    0.020632
2016-03-22    0.021373
2016-03-23    0.018532
2016-03-24    0.019767
2016-03-25    0.019211
2016-03-26    0.016802
2016-03-27    0.015649
2016-03-28    0.020859
2016-03-29    0.022341
2016-03-30    0.024771
2016-03-31    0.023783
2016-04-01    0.022794
2016-04-02    0.024915
2016-04-03    0.025203
2016-04-04    0.024483
2016-04-05    0.124761
2016-04-06    0.221806
2016-04-07    0.131947
Name: last_seen, dtype: float64

`Observations`:

The dates in the column span about one month, from the beginning of March to the beginning of April 2016. The share grows towards the end of the period: the earliest dates have the lowest values and the last three days account for nearly half of the rows. This makes sense for a "last_seen" column.

In [17]:
#ad_created
autos["ad_created"].str[:10].value_counts(normalize=True, dropna=False).sort_index()

2015-06-11    0.000021
2015-08-10    0.000021
2015-09-09    0.000021
2015-11-10    0.000021
2015-12-05    0.000021
2015-12-30    0.000021
2016-01-03    0.000021
2016-01-07    0.000021
2016-01-10    0.000041
2016-01-13    0.000021
2016-01-14    0.000021
2016-01-16    0.000021
2016-01-22    0.000021
2016-01-27    0.000062
2016-01-29    0.000021
2016-02-01    0.000021
2016-02-02    0.000041
2016-02-05    0.000041
2016-02-07    0.000021
2016-02-08    0.000021
2016-02-09    0.000021
2016-02-11    0.000021
2016-02-12    0.000041
2016-02-14    0.000041
2016-02-16    0.000021
2016-02-17    0.000021
2016-02-18    0.000041
2016-02-19    0.000062
2016-02-20    0.000041
2016-02-21    0.000062
                ...   
2016-03-09    0.033151
2016-03-10    0.031895
2016-03-11    0.032904
2016-03-12    0.036755
2016-03-13    0.017008
2016-03-14    0.035190
2016-03-15    0.034016
2016-03-16    0.030125
2016-03-17    0.031278
2016-03-18    0.013590
2016-03-19    0.033687
2016-03-20    0.037949
2016-03-21 

`Observations`:

This column includes more dates than the previous ones, starting in June 2015 and finishing at the same time as the other columns (April 2016). As it reflects when the ad was created, the range is wider than in the other columns.

Let's now take a look at the "registration_year" column to understand its distribution:
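The three distributions above were obtained by slicing the first 10 characters of each string. An alternative is to parse the strings into real datetimes with `pd.to_datetime`. The sketch below uses three timestamps taken from the `date_crawled` sample rows shown earlier and assumes every value follows the same format:

```python
import pandas as pd

crawled = pd.Series([
    "2016-03-26 17:47:46",
    "2016-04-04 13:38:56",
    "2016-03-26 18:57:24",
])

# Parse the strings into datetime64 values using an explicit format.
crawled_dt = pd.to_datetime(crawled, format="%Y-%m-%d %H:%M:%S")

# Share of rows per calendar day, in chronological order.
daily_share = crawled_dt.dt.date.value_counts(normalize=True).sort_index()
print(daily_share)
```

Once parsed, accessors such as `.dt.month` or `.dt.dayofweek` become available, which string slicing does not offer.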

In [18]:
autos["registration_year"].describe()

count    48565.000000
mean      2004.755421
std         88.643887
min       1000.000000
25%       1999.000000
50%       2004.000000
75%       2008.000000
max       9999.000000
Name: registration_year, dtype: float64

We can see some wrong values: the minimum year is 1000 and the maximum is 9999, so this column needs to be cleaned. The count (48565) matches the number of remaining rows, so there are no null values.

We will remove the rows where "registration_year" is outside the range 1900 to 2016:
- 1900: to include cars from the first decades of the 1900s
- 2016: as it is the year in which the data was crawled

In [19]:
autos = autos[autos["registration_year"].between(1900,2016)]

In [20]:
autos["registration_year"].value_counts(normalize=True, dropna=False).head(15)

2000    0.067608
2005    0.062895
1999    0.062060
2004    0.057904
2003    0.057818
2006    0.057197
2001    0.056468
2002    0.053255
1998    0.050620
2007    0.048778
2008    0.047450
2009    0.044665
1997    0.041794
2011    0.034768
2010    0.034040
Name: registration_year, dtype: float64

Looking at the registration year, we can see that most of the cars were first registered in the late 1990s and early 2000s.

## Using aggregation

We are going to analyse the "brand" column using aggregation. Let's start by checking its content:

In [21]:
autos["brand"].unique()

array(['peugeot', 'bmw', 'volkswagen', 'smart', 'ford', 'chrysler',
       'seat', 'renault', 'mercedes_benz', 'audi', 'sonstige_autos',
       'opel', 'mazda', 'porsche', 'mini', 'toyota', 'dacia', 'nissan',
       'jeep', 'saab', 'volvo', 'mitsubishi', 'jaguar', 'fiat', 'skoda',
       'subaru', 'kia', 'citroen', 'chevrolet', 'hyundai', 'honda',
       'daewoo', 'suzuki', 'trabant', 'land_rover', 'alfa_romeo', 'lada',
       'rover', 'daihatsu', 'lancia'], dtype=object)

We can see there are no brands repeated under different names (through abbreviations or misspellings). Let's find the top 10 brands to work with:

In [22]:
top_brands = autos["brand"].value_counts().head(10).index
print(top_brands)

Index(['volkswagen', 'bmw', 'opel', 'mercedes_benz', 'audi', 'ford', 'renault',
       'peugeot', 'fiat', 'seat'],
      dtype='object')


For each of these brands, we will compute the mean price and store it in a dictionary:

In [23]:
brands_dict = {}

for brand in top_brands:
    brand_mean = autos[autos["brand"] == brand]["price"].mean()
    brands_dict[brand] = brand_mean
    
print(brands_dict,"\n")

for key, value in brands_dict.items():
    print(key,":",value)

{'volkswagen': 5402.410261610221, 'bmw': 8332.820517811953, 'opel': 2975.2419354838707, 'mercedes_benz': 8628.450366422385, 'audi': 9336.687453600594, 'ford': 3749.4695065890287, 'renault': 2474.8646069968195, 'peugeot': 3094.0172290021537, 'fiat': 2813.748538011696, 'seat': 4397.230949589683} 

volkswagen : 5402.410261610221
bmw : 8332.820517811953
opel : 2975.2419354838707
mercedes_benz : 8628.450366422385
audi : 9336.687453600594
ford : 3749.4695065890287
renault : 2474.8646069968195
peugeot : 3094.0172290021537
fiat : 2813.748538011696
seat : 4397.230949589683


From the data we can see that:
- Audi, Mercedes-Benz and BMW are, in this order, the most expensive top brands.
- Volkswagen, SEAT and Ford are, in this order, the mid-priced top brands.
- Peugeot, Opel, Fiat and Renault are, in this order, the cheapest top brands.
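The loop used above makes the aggregation steps explicit. For comparison, the same per-brand mean can be written with `groupby`. This is a sketch on a hypothetical `sample` frame with made-up prices, not a replacement for the notebook's approach:

```python
import pandas as pd

# Hypothetical listings; prices are illustrative only.
sample = pd.DataFrame({
    "brand": ["bmw", "bmw", "opel", "opel", "audi", "audi", "lada"],
    "price": [8000, 9000, 2500, 3500, 9000, 10000, 1000],
})

# Keep the 3 most frequent brands, mirroring the "top brands" selection.
top_brands = sample["brand"].value_counts().head(3).index

# Mean price per brand, sorted from most to least expensive.
mean_price = (
    sample[sample["brand"].isin(top_brands)]
    .groupby("brand")["price"]
    .mean()
    .sort_values(ascending=False)
)
print(mean_price)
```

Sorting the result makes the ranking in the list above easier to read off directly.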

## Converting dictionaries to series and dataframes

We are going to use the top 6 brands to look at the average mileage and check for a relationship with the price.

From a dictionary, we can create a Series object with the constructor. The index labels will come from the dictionary keys. Additionally, we can convert a Series object to a DataFrame object and add a column label.

We will create 2 dictionaries:
- Mean price for the top 6 brands
- Mean mileage for the top 6 brands

In [29]:
top_6_brands = autos["brand"].value_counts().head(6).index

top_6_price_mean = {}
top_6_mileage_mean = {}

for brand in top_6_brands:
    selected_brand = autos[autos["brand"] == brand]
    top_6_price_mean[brand] = int(selected_brand["price"].mean())
    top_6_mileage_mean[brand] = int(selected_brand["odometer_km"].mean())
    
print(top_6_price_mean)
print(top_6_mileage_mean)

{'volkswagen': 5402, 'bmw': 8332, 'opel': 2975, 'mercedes_benz': 8628, 'audi': 9336, 'ford': 3749}
{'volkswagen': 128707, 'bmw': 132572, 'opel': 129310, 'mercedes_benz': 130788, 'audi': 129157, 'ford': 124266}


Now we can convert both dictionaries to Series objects and combine them in a DataFrame object:

In [35]:
mean_price = pd.Series(top_6_price_mean)
mean_mileage = pd.Series(top_6_mileage_mean)

mean_df = pd.DataFrame(mean_price, columns=["mean_price"])
mean_df["mean_mileage"] = mean_mileage

print(mean_df)

               mean_price  mean_mileage
volkswagen           5402        128707
bmw                  8332        132572
opel                 2975        129310
mercedes_benz        8628        130788
audi                 9336        129157
ford                 3749        124266
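As a side note, the same table can be built in one step by passing a dictionary of dictionaries to the `DataFrame` constructor: the outer keys become column labels and the inner keys become the index. The sketch below reuses the rounded values printed above and sorts by price to make the comparison easier:

```python
import pandas as pd

# Values copied from the two dictionaries printed earlier.
top_6_price_mean = {"volkswagen": 5402, "bmw": 8332, "opel": 2975,
                    "mercedes_benz": 8628, "audi": 9336, "ford": 3749}
top_6_mileage_mean = {"volkswagen": 128707, "bmw": 132572, "opel": 129310,
                      "mercedes_benz": 130788, "audi": 129157, "ford": 124266}

# Outer keys become columns, inner keys become the index.
mean_df = pd.DataFrame({
    "mean_price": top_6_price_mean,
    "mean_mileage": top_6_mileage_mean,
})

# Order brands from most to least expensive.
by_price = mean_df.sort_values("mean_price", ascending=False)
print(by_price)
```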


The mean mileage is quite similar across the six brands, while the mean prices differ considerably. This might come from the brand value: for a similar number of kilometers, a BMW tends to be more expensive than an Opel.