# eBay Car Listings Project

This project investigates a dataset of car listings from eBay. Throughout this work we will get familiar with the data, then clean and prepare it for further analysis.

The data dictionary provided with the dataset is as follows:

  *  dateCrawled - When this ad was first crawled. All field values are taken from this date.
  *  name - Name of the car.
  *  seller - Whether the seller is private or a dealer.
  *  offerType - The type of listing.
  *  price - The price on the ad to sell the car.
  *  abtest - Whether the listing is included in an A/B test.
  *  vehicleType - The vehicle type.
  *  yearOfRegistration - The year in which the car was first registered.
  *  gearbox - The transmission type.
  *  powerPS - The power of the car in PS.
  *  model - The car model name.
  *  kilometer - How many kilometers the car has driven.
  *  monthOfRegistration - The month in which the car was first registered.
  *  fuelType - What type of fuel the car uses.
  *  brand - The brand of the car.
  *  notRepairedDamage - Whether the car has damage which is not yet repaired.
  *  dateCreated - The date on which the eBay listing was created.
  *  nrOfPictures - The number of pictures in the ad.
  *  postalCode - The postal code for the location of the vehicle.
  *  lastSeenOnline - When the crawler saw this ad last online.

## Opening and Reading the data

In [106]:
import pandas as pd
import numpy as np

In [107]:
autos = pd.read_csv('autos.csv',encoding = 'latin1')
autos.head()

Unnamed: 0,dateCrawled,name,seller,offerType,price,abtest,vehicleType,yearOfRegistration,gearbox,powerPS,model,odometer,monthOfRegistration,fuelType,brand,notRepairedDamage,dateCreated,nrOfPictures,postalCode,lastSeen
0,2016-03-26 17:47:46,Peugeot_807_160_NAVTECH_ON_BOARD,privat,Angebot,"$5,000",control,bus,2004,manuell,158,andere,"150,000km",3,lpg,peugeot,nein,2016-03-26 00:00:00,0,79588,2016-04-06 06:45:54
1,2016-04-04 13:38:56,BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik,privat,Angebot,"$8,500",control,limousine,1997,automatik,286,7er,"150,000km",6,benzin,bmw,nein,2016-04-04 00:00:00,0,71034,2016-04-06 14:45:08
2,2016-03-26 18:57:24,Volkswagen_Golf_1.6_United,privat,Angebot,"$8,990",test,limousine,2009,manuell,102,golf,"70,000km",7,benzin,volkswagen,nein,2016-03-26 00:00:00,0,35394,2016-04-06 20:15:37
3,2016-03-12 16:58:10,Smart_smart_fortwo_coupe_softouch/F1/Klima/Pan...,privat,Angebot,"$4,350",control,kleinwagen,2007,automatik,71,fortwo,"70,000km",6,benzin,smart,nein,2016-03-12 00:00:00,0,33729,2016-03-15 03:16:28
4,2016-04-01 14:38:50,Ford_Focus_1_6_Benzin_TÜV_neu_ist_sehr_gepfleg...,privat,Angebot,"$1,350",test,kombi,2003,manuell,0,focus,"150,000km",7,benzin,ford,nein,2016-04-01 00:00:00,0,39218,2016-04-01 14:38:50


In [108]:
autos.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50000 entries, 0 to 49999
Data columns (total 20 columns):
dateCrawled            50000 non-null object
name                   50000 non-null object
seller                 50000 non-null object
offerType              50000 non-null object
price                  50000 non-null object
abtest                 50000 non-null object
vehicleType            44905 non-null object
yearOfRegistration     50000 non-null int64
gearbox                47320 non-null object
powerPS                50000 non-null int64
model                  47242 non-null object
odometer               50000 non-null object
monthOfRegistration    50000 non-null int64
fuelType               45518 non-null object
brand                  50000 non-null object
notRepairedDamage      40171 non-null object
dateCreated            50000 non-null object
nrOfPictures           50000 non-null int64
postalCode             50000 non-null int64
lastSeen               50000 non-null object

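There are 20 columns in our dataset, 15 of which are stored as strings (object dtype). Five columns (**vehicleType**, **gearbox**, **model**, **fuelType** and **notRepairedDamage**) contain null values, at most about 20% in any one column. We can make the column names a bit easier to read.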
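
The null counts read off `info()` above can also be computed directly. The following is a minimal sketch on a toy frame (the values are made up for illustration and `sample` is not part of the original notebook); on the real data, the same two calls would be applied to `autos`.

```python
import pandas as pd

# Toy frame standing in for `autos`; the values are invented for illustration.
sample = pd.DataFrame({
    "vehicleType": ["bus", None, "kombi", "limousine"],
    "gearbox": ["manuell", "automatik", None, None],
    "brand": ["peugeot", "bmw", "ford", "smart"],
})

# isnull() marks missing cells; the mean of a boolean column is the
# share of missing values in that column.
null_share = sample.isnull().mean().sort_values(ascending=False)
print(null_share)
```

Sorting by share puts the columns that need the most attention at the top.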

## Cleaning Column Names

In [109]:
autos.columns

Index(['dateCrawled', 'name', 'seller', 'offerType', 'price', 'abtest',
       'vehicleType', 'yearOfRegistration', 'gearbox', 'powerPS', 'model',
       'odometer', 'monthOfRegistration', 'fuelType', 'brand',
       'notRepairedDamage', 'dateCreated', 'nrOfPictures', 'postalCode',
       'lastSeen'],
      dtype='object')

In [110]:
autos.columns = [
       'date_crawled', 'name', 'seller', 'offer_type', 'price', 'abtest',
       'vehicle_type', 'registration_year', 'gearbox', 'power_ps', 'model',
       'odometer', 'registration_month', 'fuel_type', 'brand',
       'unrepaired_damage', 'ad_created', 'nr_of_pictures', 'postal_code',
       'last_seen'
]
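
Most of the renames above are a mechanical camelcase to snakecase conversion, which could be sketched with a small helper. `camel_to_snake` is a hypothetical function, not part of the original notebook, and it does not cover the renames that change the wording (for example yearOfRegistration to registration_year, or notRepairedDamage to unrepaired_damage); those would still need an explicit mapping.

```python
import re

def camel_to_snake(name):
    """Insert an underscore at each lowercase-to-uppercase boundary, then lowercase."""
    # Hypothetical helper for illustration only.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()

original = ["dateCrawled", "offerType", "powerPS", "nrOfPictures"]
converted = [camel_to_snake(c) for c in original]
print(converted)
```

Because the lookbehind requires a lowercase letter or digit, a run of capitals such as the PS in powerPS is kept together.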

In [111]:
autos.head()

Unnamed: 0,date_crawled,name,seller,offer_type,price,abtest,vehicle_type,registration_year,gearbox,power_ps,model,odometer,registration_month,fuel_type,brand,unrepaired_damage,ad_created,nr_of_pictures,postal_code,last_seen
0,2016-03-26 17:47:46,Peugeot_807_160_NAVTECH_ON_BOARD,privat,Angebot,"$5,000",control,bus,2004,manuell,158,andere,"150,000km",3,lpg,peugeot,nein,2016-03-26 00:00:00,0,79588,2016-04-06 06:45:54
1,2016-04-04 13:38:56,BMW_740i_4_4_Liter_HAMANN_UMBAU_Mega_Optik,privat,Angebot,"$8,500",control,limousine,1997,automatik,286,7er,"150,000km",6,benzin,bmw,nein,2016-04-04 00:00:00,0,71034,2016-04-06 14:45:08
2,2016-03-26 18:57:24,Volkswagen_Golf_1.6_United,privat,Angebot,"$8,990",test,limousine,2009,manuell,102,golf,"70,000km",7,benzin,volkswagen,nein,2016-03-26 00:00:00,0,35394,2016-04-06 20:15:37
3,2016-03-12 16:58:10,Smart_smart_fortwo_coupe_softouch/F1/Klima/Pan...,privat,Angebot,"$4,350",control,kleinwagen,2007,automatik,71,fortwo,"70,000km",6,benzin,smart,nein,2016-03-12 00:00:00,0,33729,2016-03-15 03:16:28
4,2016-04-01 14:38:50,Ford_Focus_1_6_Benzin_TÜV_neu_ist_sehr_gepfleg...,privat,Angebot,"$1,350",test,kombi,2003,manuell,0,focus,"150,000km",7,benzin,ford,nein,2016-04-01 00:00:00,0,39218,2016-04-01 14:38:50


## Initial Exploration and Cleaning

In [112]:
autos.describe(include='all')

Unnamed: 0,date_crawled,name,seller,offer_type,price,abtest,vehicle_type,registration_year,gearbox,power_ps,model,odometer,registration_month,fuel_type,brand,unrepaired_damage,ad_created,nr_of_pictures,postal_code,last_seen
count,50000,50000,50000,50000,50000,50000,44905,50000.0,47320,50000.0,47242,50000,50000.0,45518,50000,40171,50000,50000.0,50000.0,50000
unique,48213,38754,2,2,2357,2,8,,2,,245,13,,7,40,2,76,,,39481
top,2016-03-27 22:55:05,Ford_Fiesta,privat,Angebot,$0,test,limousine,,manuell,,golf,"150,000km",,benzin,volkswagen,nein,2016-04-03 00:00:00,,,2016-04-07 06:17:27
freq,3,78,49999,49999,1421,25756,12859,,36993,,4024,32424,,30107,10687,35232,1946,,,8
mean,,,,,,,,2005.07328,,116.35592,,,5.72336,,,,,0.0,50813.6273,
std,,,,,,,,105.712813,,209.216627,,,3.711984,,,,,0.0,25779.747957,
min,,,,,,,,1000.0,,0.0,,,0.0,,,,,0.0,1067.0,
25%,,,,,,,,1999.0,,70.0,,,3.0,,,,,0.0,30451.0,
50%,,,,,,,,2003.0,,105.0,,,6.0,,,,,0.0,49577.0,
75%,,,,,,,,2008.0,,150.0,,,9.0,,,,,0.0,71540.0,


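Notice that the **seller** and **offer_type** columns carry almost no information: each has two unique values, and one of them accounts for 49999 of the 50000 rows. We can drop these columns. **nr_of_pictures** stands out too: its mean, minimum and quartiles are all 0.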

In [113]:
autos["nr_of_pictures"].value_counts()

0    50000
Name: nr_of_pictures, dtype: int64

This column is 0 in every row, so we should drop it as well.

In [114]:
autos = autos.drop(["nr_of_pictures", "seller", "offer_type"], axis=1)
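
Columns like these can also be found programmatically rather than by reading the `describe()` table. The sketch below runs on invented toy data, and the 0.8 threshold is an arbitrary assumption chosen for this small sample; it flags columns where a single value dominates.

```python
import pandas as pd

# Toy frame; values are invented for illustration.
sample = pd.DataFrame({
    "seller": ["privat"] * 9 + ["gewerblich"],
    "nr_of_pictures": [0] * 10,
    "brand": ["bmw", "audi", "ford", "opel", "bmw",
              "audi", "ford", "opel", "bmw", "audi"],
})

# Share of rows taken by the most frequent value in each column.
top_share = sample.apply(
    lambda col: col.value_counts(normalize=True, dropna=False).iloc[0]
)

# Columns dominated by one value carry little information.
low_info = top_share[top_share >= 0.8].index.tolist()
print(low_info)
```

A flagged column is only a candidate for removal; whether to drop it remains a judgment call.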

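Also, the **price** and **odometer** columns are stored as strings because they contain extra characters (a currency symbol, thousands separators and a km suffix). We can strip these and convert both columns to integers.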

In [115]:
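# regex=False makes pandas treat '$' as a literal character, not a regex anchor
autos['price'] = (autos.price
                        .str.replace('$', '', regex=False)
                        .str.replace(',', '', regex=False)
                        .astype(int)
                 )
autos['odometer'] = (autos.odometer
                        .str.replace('km', '', regex=False)
                        .str.replace(',', '', regex=False)
                        .astype(int)
                       )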

In [116]:
autos["price"].dtype

dtype('int64')
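
Since both columns are cleaned the same way, the step could be wrapped in a small helper. This is only a sketch: `strip_to_int` is a hypothetical function that is not in the original notebook, and the two-row frame is invented for illustration.

```python
import pandas as pd

def strip_to_int(series, tokens):
    """Remove each literal token from a string Series, then cast to int."""
    # Hypothetical helper; regex=False treats '$' as a plain character.
    for token in tokens:
        series = series.str.replace(token, "", regex=False)
    return series.astype(int)

raw = pd.DataFrame({
    "price": ["$5,000", "$0"],
    "odometer": ["150,000km", "5,000km"],
})
raw["price"] = strip_to_int(raw["price"], ["$", ","])
raw["odometer"] = strip_to_int(raw["odometer"], ["km", ","])
print(raw.dtypes)
```

The cast raises an error if any unexpected character is left over, which is a useful check that the token list is complete.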

We can rename the **odometer** column to **odometer_km** so that the units are clear.

In [117]:
autos = autos.rename({'odometer':'odometer_km'},axis = 1)

## Exploring the Odometer and Price Columns

In [118]:
autos["odometer_km"].value_counts()

150000    32424
125000     5170
100000     2169
90000      1757
80000      1436
70000      1230
60000      1164
50000      1027
5000        967
40000       819
30000       789
20000       784
10000       264
Name: odometer_km, dtype: int64

In [119]:
autos[['price','odometer_km']].describe()

Unnamed: 0,price,odometer_km
count,50000.0,50000.0
mean,9840.044,125732.7
std,481104.4,40042.211706
min,0.0,5000.0
25%,1100.0,125000.0
50%,2950.0,150000.0
75%,7200.0,150000.0
max,100000000.0,150000.0


It seems that sellers round the odometer values (there are only 13 distinct ones), and most of the vehicles for sale have high mileage: about 65% are listed at 150,000 km. Let's take a look at the **price** column now.

In [120]:
autos.price.value_counts().sort_index()

0           1421
1            156
2              3
3              1
5              2
8              1
9              1
10             7
11             2
12             3
13             2
14             1
15             2
17             3
18             1
20             4
25             5
29             1
30             7
35             1
40             6
45             4
47             1
49             4
50            49
55             2
59             1
60             9
65             5
66             1
            ... 
151990         1
155000         1
163500         1
163991         1
169000         1
169999         1
175000         1
180000         1
190000         1
194000         1
197000         1
198000         1
220000         1
250000         1
259000         1
265000         1
295000         1
299000         1
345000         1
350000         1
999990         1
999999         2
1234566        1
1300000        1
3890000        1
10000000       1
11111111       2
12345678      

The numbers here are rounded too. However, 1421 cars are listed for free (\$0), which is odd, as is the most expensive listing at \$99999999. We can remove both, along with the few other listings priced above \$1,000,000.

Let's keep the cars priced from \$1, since eBay has auctions where the starting price might be one dollar.

In [121]:
autos = autos[autos["price"].between(1,1000000)]
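
Before applying a filter like this, it can help to count how many rows it keeps. The sketch below uses a made-up price Series (not the real data) and relies on the fact that `Series.between` includes both endpoints by default.

```python
import pandas as pd

# Invented prices covering both extremes.
prices = pd.Series([0, 1, 50, 3000, 7490, 999999, 12345678])

# between() is inclusive on both ends by default.
keep = prices.between(1, 1000000)
print(f"kept {keep.sum()} of {len(prices)} rows")

filtered = prices[keep]
```

Keeping the boolean mask in its own variable makes it easy to inspect the dropped rows with `prices[~keep]` before committing to the filter.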

In [122]:
autos.price.value_counts().sort_index()

1         156
2           3
3           1
5           2
8           1
9           1
10          7
11          2
12          3
13          2
14          1
15          2
17          3
18          1
20          4
25          5
29          1
30          7
35          1
40          6
45          4
47          1
49          4
50         49
55          2
59          1
60          9
65          5
66          1
70         10
         ... 
120000      2
128000      1
129000      1
130000      1
135000      1
137999      1
139997      1
145000      1
151990      1
155000      1
163500      1
163991      1
169000      1
169999      1
175000      1
180000      1
190000      1
194000      1
197000      1
198000      1
220000      1
250000      1
259000      1
265000      1
295000      1
299000      1
345000      1
350000      1
999990      1
999999      2
Name: price, Length: 2348, dtype: int64

In [123]:
autos["price"].describe()

count     48568.000000
mean       5950.340656
std       11963.134750
min           1.000000
25%        1200.000000
50%        3000.000000
75%        7490.000000
max      999999.000000
Name: price, dtype: float64

## Exploring the date columns

We have a few columns with dates stated in them:
*  date_crawled
*  registration_month
*  registration_year
*  ad_created
*  last_seen

We should explore them more.

In [124]:
autos[['date_crawled','ad_created','last_seen']]

Unnamed: 0,date_crawled,ad_created,last_seen
0,2016-03-26 17:47:46,2016-03-26 00:00:00,2016-04-06 06:45:54
1,2016-04-04 13:38:56,2016-04-04 00:00:00,2016-04-06 14:45:08
2,2016-03-26 18:57:24,2016-03-26 00:00:00,2016-04-06 20:15:37
3,2016-03-12 16:58:10,2016-03-12 00:00:00,2016-03-15 03:16:28
4,2016-04-01 14:38:50,2016-04-01 00:00:00,2016-04-01 14:38:50
5,2016-03-21 13:47:45,2016-03-21 00:00:00,2016-04-06 09:45:21
6,2016-03-20 17:55:21,2016-03-20 00:00:00,2016-03-23 02:48:59
7,2016-03-16 18:55:19,2016-03-16 00:00:00,2016-04-07 03:17:32
8,2016-03-22 16:51:34,2016-03-22 00:00:00,2016-03-26 18:18:10
9,2016-03-16 13:47:02,2016-03-16 00:00:00,2016-04-06 10:46:35


In [125]:
autos['date_crawled'].str[:10].value_counts(normalize=True, dropna=False).sort_index()

2016-03-05    0.025325
2016-03-06    0.014042
2016-03-07    0.036011
2016-03-08    0.033294
2016-03-09    0.033088
2016-03-10    0.032182
2016-03-11    0.032573
2016-03-12    0.036917
2016-03-13    0.015669
2016-03-14    0.036547
2016-03-15    0.034282
2016-03-16    0.029608
2016-03-17    0.031646
2016-03-18    0.012910
2016-03-19    0.034776
2016-03-20    0.037885
2016-03-21    0.037391
2016-03-22    0.032985
2016-03-23    0.032223
2016-03-24    0.029340
2016-03-25    0.031605
2016-03-26    0.032202
2016-03-27    0.031090
2016-03-28    0.034858
2016-03-29    0.034117
2016-03-30    0.033685
2016-03-31    0.031832
2016-04-01    0.033685
2016-04-02    0.035476
2016-04-03    0.038606
2016-04-04    0.036485
2016-04-05    0.013095
2016-04-06    0.003171
2016-04-07    0.001400
Name: date_crawled, dtype: float64

We see that the crawl dates run from 5 March to 7 April 2016, with a fairly even share of listings (roughly 3% to 4%) on most days.
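
Slicing the first ten characters works because the timestamps share one fixed format. An alternative, sketched here on three sample timestamps taken from the table above, is to parse the strings with `pd.to_datetime` and group by calendar day; the format string is an assumption based on the values shown.

```python
import pandas as pd

crawled = pd.Series([
    "2016-03-26 17:47:46",
    "2016-04-04 13:38:56",
    "2016-03-26 18:57:24",
])

# Parse the strings once, then count listings per calendar day.
parsed = pd.to_datetime(crawled, format="%Y-%m-%d %H:%M:%S")
per_day = parsed.dt.date.value_counts(normalize=True).sort_index()
print(per_day)
```

Parsed timestamps also allow date arithmetic, such as the number of days between two date columns, which string slices do not.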

In [126]:
autos['ad_created'].str[:10].value_counts(normalize=True, dropna=False).sort_index()

2015-06-11    0.000021
2015-08-10    0.000021
2015-09-09    0.000021
2015-11-10    0.000021
2015-12-05    0.000021
2015-12-30    0.000021
2016-01-03    0.000021
2016-01-07    0.000021
2016-01-10    0.000041
2016-01-13    0.000021
2016-01-14    0.000021
2016-01-16    0.000021
2016-01-22    0.000021
2016-01-27    0.000062
2016-01-29    0.000021
2016-02-01    0.000021
2016-02-02    0.000041
2016-02-05    0.000041
2016-02-07    0.000021
2016-02-08    0.000021
2016-02-09    0.000021
2016-02-11    0.000021
2016-02-12    0.000041
2016-02-14    0.000041
2016-02-16    0.000021
2016-02-17    0.000021
2016-02-18    0.000041
2016-02-19    0.000062
2016-02-20    0.000041
2016-02-21    0.000062
                ...   
2016-03-09    0.033149
2016-03-10    0.031893
2016-03-11    0.032902
2016-03-12    0.036753
2016-03-13    0.017007
2016-03-14    0.035188
2016-03-15    0.034014
2016-03-16    0.030123
2016-03-17    0.031296
2016-03-18    0.013589
2016-03-19    0.033685
2016-03-20    0.037947
2016-03-21 

In [127]:
autos['last_seen'].str[:10].value_counts(normalize=True, dropna=False).sort_index()

2016-03-05    0.001071
2016-03-06    0.004324
2016-03-07    0.005394
2016-03-08    0.007412
2016-03-09    0.009595
2016-03-10    0.010665
2016-03-11    0.012374
2016-03-12    0.023781
2016-03-13    0.008895
2016-03-14    0.012601
2016-03-15    0.015875
2016-03-16    0.016451
2016-03-17    0.028084
2016-03-18    0.007351
2016-03-19    0.015833
2016-03-20    0.020651
2016-03-21    0.020631
2016-03-22    0.021372
2016-03-23    0.018531
2016-03-24    0.019766
2016-03-25    0.019210
2016-03-26    0.016801
2016-03-27    0.015648
2016-03-28    0.020878
2016-03-29    0.022360
2016-03-30    0.024769
2016-03-31    0.023781
2016-04-01    0.022793
2016-04-02    0.024914
2016-04-03    0.025202
2016-04-04    0.024481
2016-04-05    0.124753
2016-04-06    0.221813
2016-04-07    0.131939
Name: last_seen, dtype: float64

In [128]:
autos["last_seen"].str[:10].value_counts(normalize=True, dropna=False).sort_values()

2016-03-05    0.001071
2016-03-06    0.004324
2016-03-07    0.005394
2016-03-18    0.007351
2016-03-08    0.007412
2016-03-13    0.008895
2016-03-09    0.009595
2016-03-10    0.010665
2016-03-11    0.012374
2016-03-14    0.012601
2016-03-27    0.015648
2016-03-19    0.015833
2016-03-15    0.015875
2016-03-16    0.016451
2016-03-26    0.016801
2016-03-23    0.018531
2016-03-25    0.019210
2016-03-24    0.019766
2016-03-21    0.020631
2016-03-20    0.020651
2016-03-28    0.020878
2016-03-22    0.021372
2016-03-29    0.022360
2016-04-01    0.022793
2016-03-31    0.023781
2016-03-12    0.023781
2016-04-04    0.024481
2016-03-30    0.024769
2016-04-02    0.024914
2016-04-03    0.025202
2016-03-17    0.028084
2016-04-05    0.124753
2016-04-07    0.131939
2016-04-06    0.221813
Name: last_seen, dtype: float64

A listing's "last seen" date might mean that the car was sold, or simply that the crawling period ended. The last three days account for about 48% of all values, far more than any earlier day, so these values most likely reflect the end of the crawling period rather than a spike in car sales.

## Dealing with Incorrect Registration Year Data

In [129]:
autos["registration_year"].describe()

count    48568.000000
mean      2004.754612
std         88.641262
min       1000.000000
25%       1999.000000
50%       2004.000000
75%       2008.000000
max       9999.000000
Name: registration_year, dtype: float64

Registration year is the year in which the car was first registered, so values such as the minimum of 1000 and the maximum of 9999 are clearly invalid. We can restrict the column to the period from 1901 (the first mass-produced car) to 2016 (the year the listings were crawled, since a car cannot be first registered after its listing was seen).

In [130]:
autos = autos[autos.registration_year.between(1901,2016)]
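
The upper bound of 2016 could also be derived from the data instead of being hard-coded. The sketch below is one way to do that on invented sample values; taking the latest year in `last_seen` as the bound is an assumption about what counts as a plausible registration year.

```python
import pandas as pd

# Invented sample values for illustration.
years = pd.Series([1000, 1910, 1999, 2004, 2016, 2017, 9999, 2008])
last_seen = pd.Series(["2016-04-06 06:45:54", "2016-03-15 03:16:28"])

# A car cannot be first registered after the listing was last seen.
latest_year = int(pd.to_datetime(last_seen).dt.year.max())

valid = years.between(1901, latest_year)
cleaned = years[valid]
print(f"{(~valid).mean():.1%} of rows fall outside 1901-{latest_year}")
```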

In [131]:
autos["registration_year"].value_counts()

2000    3156
2005    2936
1999    2897
2004    2703
2003    2699
2006    2670
2001    2636
2002    2486
1998    2363
2007    2277
2008    2215
2009    2086
1997    1951
2011    1623
2010    1589
1996    1373
2012    1310
1995    1227
2016    1220
2013     803
2014     663
1994     629
1993     425
2015     392
1992     370
1990     347
1991     339
1989     174
1988     135
1985      96
        ... 
1966      22
1976      21
1969      19
1975      18
1965      17
1964      12
1963       8
1959       6
1961       6
1910       5
1956       4
1958       4
1937       4
1962       4
1950       3
1954       2
1941       2
1951       2
1934       2
1957       2
1955       2
1953       1
1943       1
1929       1
1939       1
1938       1
1948       1
1927       1
1931       1
1952       1
Name: registration_year, Length: 78, dtype: int64

In [132]:
autos["registration_year"].describe()

count    46684.000000
mean      2002.910033
std          7.186122
min       1910.000000
25%       1999.000000
50%       2003.000000
75%       2008.000000
max       2016.000000
Name: registration_year, dtype: float64

## Exploring Price by Brand

In [133]:
autos["brand"].unique()

array(['peugeot', 'bmw', 'volkswagen', 'smart', 'ford', 'chrysler',
       'seat', 'renault', 'mercedes_benz', 'audi', 'sonstige_autos',
       'opel', 'mazda', 'porsche', 'mini', 'toyota', 'dacia', 'nissan',
       'jeep', 'saab', 'volvo', 'mitsubishi', 'jaguar', 'fiat', 'skoda',
       'subaru', 'kia', 'citroen', 'chevrolet', 'hyundai', 'honda',
       'daewoo', 'suzuki', 'trabant', 'land_rover', 'alfa_romeo', 'lada',
       'rover', 'daihatsu', 'lancia'], dtype=object)

In [134]:
autos.brand.value_counts(normalize=True).sort_values(ascending=False)

volkswagen        0.211293
bmw               0.110038
opel              0.107574
mercedes_benz     0.096457
audi              0.086561
ford              0.069917
renault           0.047147
peugeot           0.029839
fiat              0.025640
seat              0.018272
skoda             0.016408
nissan            0.015273
mazda             0.015187
smart             0.014159
citroen           0.014009
toyota            0.012702
hyundai           0.010025
sonstige_autos    0.009811
volvo             0.009147
mini              0.008761
mitsubishi        0.008226
honda             0.007840
kia               0.007069
alfa_romeo        0.006640
porsche           0.006126
suzuki            0.005934
chevrolet         0.005698
chrysler          0.003513
dacia             0.002635
daihatsu          0.002506
jeep              0.002271
subaru            0.002142
land_rover        0.002099
saab              0.001649
jaguar            0.001564
daewoo            0.001499
trabant           0.001392
r

German brands are clearly the most popular: Volkswagen leads with about 21% of listings, roughly double the share of the next brand, BMW (11%). Let's take the six most popular brands and compute their mean prices from our dataset.

In [135]:
top_6 = autos.brand.value_counts(normalize=True).index[:6]
brand_prices = {}
for brand in top_6:
    group = autos[autos.brand == brand]
    brand_prices[brand] = int(group.price.mean())
brand_prices

{'audi': 9336,
 'bmw': 8332,
 'ford': 4054,
 'mercedes_benz': 8628,
 'opel': 2975,
 'volkswagen': 5604}
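
The same result can be obtained without an explicit loop. The sketch below, on invented toy data, selects brands by their share of listings and computes the mean price with `groupby`. The 30% threshold is arbitrary and only suits this toy sample; in the real data a 5% threshold would select the same six brands, since the seventh (Renault) sits at about 4.7%.

```python
import pandas as pd

# Toy listings; values are invented for illustration.
sample = pd.DataFrame({
    "brand": ["volkswagen", "volkswagen", "bmw", "bmw", "opel", "lada"],
    "price": [5000, 6000, 8000, 9000, 3000, 1000],
})

# Keep brands that hold at least 30% of the listings.
share = sample["brand"].value_counts(normalize=True)
common = share[share >= 0.3].index

mean_price = (
    sample[sample["brand"].isin(common)]
    .groupby("brand")["price"]
    .mean()
    .sort_values(ascending=False)
)
print(mean_price)
```

Unlike the loop above, this keeps the means as floats rather than truncating them with `int()`.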

In [136]:
top6_series = pd.Series(brand_prices)
print(top6_series.sort_values(ascending=False))

audi             9336
mercedes_benz    8628
bmw              8332
volkswagen       5604
ford             4054
opel             2975
dtype: int64


In [137]:
df_price = pd.DataFrame(top6_series.sort_values(ascending=False), columns = ["mean_price"])
df_price

Unnamed: 0,mean_price
audi,9336
mercedes_benz,8628
bmw,8332
volkswagen,5604
ford,4054
opel,2975


Audi, Mercedes-Benz and BMW are the most expensive, Volkswagen sits in the middle, and Ford and Opel are the cheapest. Volkswagen's popularity is probably due to this middle position: it might be seen as a good compromise.

## Exploring Mileage by Brand

We can repeat the same operation to compute the mean mileage of the same six brands.

In [138]:
autos["odometer_km"].value_counts()

150000    30087
125000     4858
100000     2058
90000      1673
80000      1375
70000      1187
60000      1128
50000       993
40000       797
5000        785
30000       760
20000       742
10000       241
Name: odometer_km, dtype: int64

In [139]:
# Reuse top_6 so the mileage figures cover the same brands as the price figures
brand_mileage = {}
for brand in top_6:
    group = autos[autos.brand == brand]
    brand_mileage[brand] = int(group.odometer_km.mean())
brand_mileage

{'audi': 129157,
 'bmw': 132572,
 'ford': 124266,
 'mercedes_benz': 130788,
 'opel': 129310,
 'volkswagen': 128711}

In [140]:
top6m_series = pd.Series(brand_mileage)
print(top6m_series)

audi             129157
bmw              132572
ford             124266
mercedes_benz    130788
opel             129310
volkswagen       128711
dtype: int64


In [141]:
df_mileage = pd.DataFrame(top6m_series, columns = ["mean_mileage"])
df_mileage

Unnamed: 0,mean_mileage
audi,129157
bmw,132572
ford,124266
mercedes_benz,130788
opel,129310
volkswagen,128711


Mean mileage differs little between brands: it ranges from about 124,000 km (Ford) to about 133,000 km (BMW), so mileage does not appear to explain the price differences. We can combine the price and mileage data into one dataframe and sort it by price.

In [142]:
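# Copy so that df_mileage itself is left unchanged
df_total = df_mileage.copy()
# Assign the column (a Series) so the values align on the brand index
df_total["mean_price"] = df_price["mean_price"]
df_total = df_total.sort_values(by=["mean_price"], ascending=False)
df_total = df_total[["mean_price", "mean_mileage"]]
df_total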

Unnamed: 0,mean_price,mean_mileage
audi,9336,129157
mercedes_benz,8628,130788
bmw,8332,132572
volkswagen,5604,128711
ford,4054,124266
opel,2975,129310
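
A combined table like this can also be built in a single step with named aggregation. The following is a sketch on invented toy data rather than the notebook's method; the output column names mirror the ones used above.

```python
import pandas as pd

# Toy listings; values are invented for illustration.
sample = pd.DataFrame({
    "brand": ["audi", "audi", "opel", "opel"],
    "price": [9000, 10000, 2000, 4000],
    "odometer_km": [125000, 150000, 150000, 100000],
})

# One groupby produces both summary columns, already aligned by brand.
summary = (
    sample.groupby("brand")
    .agg(mean_price=("price", "mean"), mean_mileage=("odometer_km", "mean"))
    .sort_values("mean_price", ascending=False)
)
print(summary)
```

Because both means come from the same grouped object, there is no separate step to merge two tables on the brand index.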


## Conclusion

During this project we explored and analyzed eBay car listings data and discovered some trends and details about brands, their mean prices and their mileage.