gokhanturer/Portfolio_data_science_projects

CLRSWY


Exploratory Data Analysis (EDA) Project

EDA for Car Price Prediction Model


Description:

  • A .json file containing a dataset of 15919 rows and 54 columns is provided.
  • The dataset was scraped from an online car trading company in 2019 and contains many features of 9 different car models.
  • The features (variables) of this dataset are messy and distorted.

What is expected of you?

  • Read the .json file and assign the dataset into a DataFrame using pandas.
  • Apply all aspects of the EDA process to the dataset.
    • Fix corrupted data formats,
    • Handle outliers and missing values,
      • Domain (automobiles) knowledge is important.
      • Always use the internet to do the research that you need.
      • Think carefully when deciding whether a data point is an outlier. Examples:
        • No conventional car model has an average fuel consumption of 1 to 1.5 liters per 100 km.
        • A car priced at 300 euros is not plausible.
        • If only one car out of the 15919 has 3 doors, that record deserves attention and a closer look.
    • Drop the columns / rows you determine to be unnecessary as a result of your analysis,
    • Use visualization tools while doing all these processes.
  • As a result, prepare the dataset so that it can serve as appropriate input to ML models.
  • Save the cleaned dataset to a .csv file.
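
The steps above can be sketched roughly as follows. This is a minimal illustration, not the expected solution: the column names (`make_model`, `price`, `km`, `consumption`), the sample values, and the outlier thresholds are all assumptions made up for the example, and the real file may need different parsing.

```python
import io
import pandas as pd

# Tiny made-up sample standing in for the provided .json file.
# Column names and values are hypothetical, not taken from the real dataset.
raw_json = io.StringIO(
    '['
    '{"make_model": "Audi A1", "price": 15770, "km": "56,013 km", "consumption": "3.8 l/100 km (comb)"},'
    '{"make_model": "Audi A1", "price": 300, "km": "80,000 km", "consumption": "4.0 l/100 km (comb)"},'
    '{"make_model": "Audi A1", "price": 14500, "km": "83,450 km", "consumption": "1.2 l/100 km (comb)"},'
    '{"make_model": "Opel Corsa", "price": 9950, "km": "10 km", "consumption": "5.4 l/100 km (comb)"}'
    ']'
)
df = pd.read_json(raw_json)

# Fix corrupted formats: pull the numeric part out of text columns.
df["km"] = pd.to_numeric(
    df["km"].str.replace(",", "", regex=False).str.extract(r"(\d+)", expand=False),
    errors="coerce",
)
df["consumption"] = pd.to_numeric(
    df["consumption"].str.extract(r"(\d+(?:\.\d+)?)", expand=False),
    errors="coerce",
)

# Handle outliers with domain knowledge (both thresholds are assumptions):
# treat an implausibly low consumption as missing, drop implausibly cheap cars.
df.loc[df["consumption"] < 2.0, "consumption"] = float("nan")
df = df[df["price"] >= 1000].copy()

# Handle missing values: fill consumption with the median of the same model.
model_median = df.groupby("make_model")["consumption"].transform("median")
df["consumption"] = df["consumption"].fillna(model_median)

# Save the cleaned data. Without a path, to_csv returns the CSV as a string;
# passing a file name (for example "cleaned_cars.csv") writes a file instead.
csv_text = df.to_csv(index=False)
```

Whether a suspicious value is dropped, set to missing, or kept is a judgment call that depends on the column; the sketch only shows one option for each case.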

Topics to study:

  • .str methods (the string accessor),
  • .str.contains(),
  • .str.extract(),
  • pd.to_datetime(),
  • pd.get_dummies(),
  • .add_prefix(),
  • .sample(),
  • regex,
  • pd.to_numeric(),
  • .isin(),
  • .corr().
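
As a quick orientation, the listed tools can be tried on a small made-up frame. The column names and values below are hypothetical and only serve to show what each call does.

```python
import pandas as pd

# Made-up sample data; not taken from the real dataset.
cars = pd.DataFrame({
    "description": ["Audi A1 Diesel 85 kW", "Opel Corsa Gasoline 66 kW", "Renault Clio Diesel 55 kW"],
    "first_registration": ["01/2016", "03/2018", "06/2017"],
    "fuel": ["Diesel", "Gasoline", "Diesel"],
    "price": ["15770", "9950", "n/a"],
})

# .str.contains(): boolean mask for rows whose text contains a substring.
is_diesel = cars["description"].str.contains("Diesel", regex=False)

# .str.extract() with a regex capture group, then pd.to_numeric().
cars["power_kw"] = pd.to_numeric(
    cars["description"].str.extract(r"(\d+)\s*kW", expand=False)
)

# pd.to_datetime() with an explicit format string.
cars["first_registration"] = pd.to_datetime(cars["first_registration"], format="%m/%Y")

# pd.to_numeric(errors="coerce") turns unparseable entries into NaN.
cars["price"] = pd.to_numeric(cars["price"], errors="coerce")

# pd.get_dummies() one-hot encodes a column; .add_prefix() renames the new columns.
fuel_dummies = pd.get_dummies(cars["fuel"]).add_prefix("fuel_")

# .isin() filters rows by membership in a list of values.
diesel_rows = cars[cars["fuel"].isin(["Diesel"])]

# .corr() computes pairwise correlations between numeric columns.
corr = cars[["power_kw", "price"]].corr()

# .sample() draws random rows; random_state makes the draw repeatable.
sampled = cars.sample(n=2, random_state=0)
```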
