frozenkorean/portfolio-data-science

portfolio-data-science

Nam-kyu's Data Science Portfolio

This repository contains my data science portfolio. It highlights my knowledge and skills in Python, SQL, and R. The projects come from the IBM Data Science Professional Certificate (Coursera), the Programming for Data Science with Python Nanodegree (Udacity), and other sources.

My portfolio does not include projects or assignments of the 'fill in the blank with a line of code' type. I have included only projects and assignments where I had to put in time and work to write and edit the code and make it run.

I have only summarized them here. Please check out the individual folders for more details.

Python

1. IBM_Data_Science_Capstone

Title: Health-promoting resources and life expectancy at birth for communities in Chicago.

Capstone project for the IBM Data Science Professional Certificate provided by Coursera.

  • Given Objective: Use the Foursquare API and clustering to compare neighborhoods. Choose any location, data, and theme that seem interesting.
  • Packages Used: NumPy, Pandas, Matplotlib, Folium, Geopy, Scikit-learn, and more.
  • My Project: Using k-means clustering, I grouped the communities into five clusters based on the number of health resources (gathered with the Foursquare API) in four categories: healthcare facilities, healthy food stores, outdoor places, and sports facilities. I then compared the clusters with average life expectancy and displayed the results over a choropleth map of life expectancy for Chicago communities. This identified the areas most in need of resources, along with some outlier communities.
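
For readers unfamiliar with the clustering step, here is a minimal sketch of the idea. It is not the capstone notebook's code. The project lists Scikit-learn among its packages, while this sketch uses a plain-Python version of Lloyd's k-means algorithm so that it runs without dependencies. The community names, resource counts, and life expectancy values are made up for illustration. The farthest-first initialization and the choice of three clusters (the project used five) are assumptions made to keep the toy example small and deterministic.

```python
# Hypothetical resource counts per community, in the order:
# [healthcare facilities, healthy food stores, outdoor places, sports facilities]
counts = {
    "Community A": [12, 9, 15, 10],
    "Community B": [11, 10, 14, 9],
    "Community C": [2, 1, 3, 2],
    "Community D": [1, 2, 2, 1],
    "Community E": [6, 5, 7, 6],
    "Community F": [5, 6, 8, 5],
}
# Hypothetical life expectancy at birth (years) for the same communities.
life_expectancy = {
    "Community A": 81.0, "Community B": 80.0, "Community C": 72.0,
    "Community D": 71.0, "Community E": 77.0, "Community F": 76.0,
}


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic initialization (an assumption of this sketch):
    start at the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid updates."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


names = list(counts)
labels, centroids = kmeans([counts[n] for n in names], k=3)

# Compare clusters with life expectancy: mean value per cluster.
cluster_mean = {}
for c in sorted(set(labels)):
    members = [life_expectancy[n] for n, lab in zip(names, labels) if lab == c]
    cluster_mean[c] = sum(members) / len(members)

for name, lab in zip(names, labels):
    print(f"{name}: cluster {lab}")
print(cluster_mean)
```

With real data, the four counts would likely need scaling before clustering so that no single category dominates the distance. Whether the project did so is not stated here.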

2. Explore_US_Bikeshare_Data

Final project for the Python component of the Programming for Data Science with Python Nanodegree on Udacity.

  • Given Objective: Using bikeshare data for three cities (Chicago, New York, and Washington), get user input and display some statistics.
  • Packages Used: NumPy and Pandas
  • My Project: The user chooses one of the cities and then chooses whether to filter the data by month, by day, by both, or not at all. Every user input is validated, so the user can re-enter a choice or quit the program. Various statistics, including customer characteristics, popular stations, and more, are then calculated.
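
The input checks described above can be sketched roughly as follows. This is an illustration and not the project's actual code: the function names `ask_choice` and `get_filters`, the `'quit'` keyword, the prompt wording, and the full twelve-month list are all assumptions. The `read` parameter is a hypothetical hook that lets the demo feed scripted replies instead of calling `input`.

```python
CITIES = ["chicago", "new york", "washington"]
# Assumption: all twelve months are offered; the real data may cover fewer.
MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]
DAYS = ["monday", "tuesday", "wednesday", "thursday", "friday",
        "saturday", "sunday"]


def ask_choice(prompt, valid, read=input):
    """Ask until the reply is one of `valid`; return None if the user quits."""
    while True:
        reply = read(f"{prompt} ({', '.join(valid)}, or 'quit'): ").strip().lower()
        if reply == "quit":
            return None
        if reply in valid:
            return reply
        print(f"'{reply}' is not a valid choice, please try again.")


def get_filters(read=input):
    """Collect city, month, and day choices; return None if the user quits."""
    city = ask_choice("Which city?", CITIES, read)
    if city is None:
        return None
    mode = ask_choice("Filter by", ["month", "day", "both", "none"], read)
    if mode is None:
        return None
    month = day = None
    if mode in ("month", "both"):
        month = ask_choice("Which month?", MONTHS, read)
        if month is None:
            return None
    if mode in ("day", "both"):
        day = ask_choice("Which day?", DAYS, read)
        if day is None:
            return None
    return {"city": city, "month": month, "day": day}


# Scripted replies stand in for a user, including two invalid entries
# ("Boston" and "funday") that trigger a re-prompt.
scripted = iter(["Boston", "Chicago", "both", "June", "funday", "Friday"])
filters = get_filters(read=lambda prompt: next(scripted))
print(filters)
```

In an interactive session, calling `get_filters()` with no arguments would read from the keyboard. The returned dictionary could then drive the month and day filtering before the statistics are computed.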

SQL

3. Investigate_Sakila_Movie_Database

Final project for the SQL component of the Programming for Data Science with Python Nanodegree on Udacity.

  • Given Objective: Using the Sakila Movie Database, come up with four queries answering your own business questions.
  • My Project: Acting as a new manager, I divided customers into groups based on number of rentals, total payment, and rentals with no returns. I ranked the VVIPs by payment amount and listed their favorite movie genres so the store could cater to them better. To attract more customers, I identified popular actors in the top countries (those with 20 or more customers) for use in commercials. Finally, I analyzed the monthly payment trend for these countries.
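
To give a feel for the customer grouping, here is a small sketch that runs a query of that general shape from Python against an in-memory SQLite database. It is not one of the project's four queries. The two tables are a heavily simplified, hypothetical stand-in for Sakila's `customer` and `payment` tables, the rows are made up, and the tier names and thresholds (150 and 50) are assumptions chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy stand-in for two Sakila tables, with made-up rows.
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE payment  (payment_id  INTEGER PRIMARY KEY,
                           customer_id INTEGER,
                           amount      REAL);
    INSERT INTO customer VALUES (1, 'ANA'), (2, 'BEN'), (3, 'CAL');
    INSERT INTO payment (customer_id, amount) VALUES
        (1, 90.0), (1, 70.0), (2, 50.0), (2, 30.0), (3, 10.0);
""")

# Total payment per customer, a tier label based on hypothetical
# thresholds, and an ordering that ranks the biggest spenders first.
query = """
    SELECT c.first_name,
           COUNT(p.payment_id) AS n_payments,
           SUM(p.amount)       AS total_paid,
           CASE WHEN SUM(p.amount) >= 150 THEN 'VVIP'
                WHEN SUM(p.amount) >= 50  THEN 'regular'
                ELSE 'occasional'
           END AS tier
    FROM customer AS c
    JOIN payment  AS p ON p.customer_id = c.customer_id
    GROUP BY c.customer_id, c.first_name
    ORDER BY total_paid DESC;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

The same pattern (aggregate per customer, label with `CASE`, then order by the aggregate) would extend to rental counts or unreturned rentals by joining the corresponding tables.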
