HallofFame1/datafun-06-projects

Assignment Project 6

In this module, you'll complete a guided exploratory data analysis project, then conduct a second, unique data analysis/exploration project. The goal is to tell a unique and compelling story with data. Not just your analytical capabilities, but also your ability to communicate in a professional and engaging manner, is key.

Chapters

This module requires the skills learned in previous chapters. The first, guided exploratory data project focuses on diamonds.csv and is based on Exercise 9.16 beginning on page 352 of the text. The second is a project of your choice, related to your domain.

Get Started

  1. Create a new GitHub repo named datafun-06-projects.
  2. Git clone your new repo into your Documents folder.
  3. Always ensure your repo has the 3 basic files all our repos need:
     • a good README.md,
     • .gitignore, and
     • about.py.
  4. Copy these from previous repos as needed.
  5. Update README.md to reflect the focus of this module.

Task 1. Begin with the End in Mind

Read the exercise. Begin considering what you'd like your second project to focus on / showcase.

Use your second project to show all the Python things you know.

  • Read from a data file.
  • Use statistics - mean, median, mode, standard deviation, variance for one or more of the numerical columns.
  • Use built-in functions min(), max(), len(), count of records, number of columns, others...
  • Create some custom functions, use some branching logic to transform and/or show only a part of the data.
  • Get some of your data into a list.
  • Use filter(), map(), and list comprehensions to clean and transform the data.
  • Use pandas.
  • Use matplotlib to plot a histogram with hist().
  • Strive to "tell a story" with data. Use a good title section and useful section headings to professionally present your work.
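
As a starting point for the statistics, built-in function, and custom function items above, here is a minimal sketch using only the standard library. The sample numbers, the `summarize()` and `size_label()` helpers, and the cutoff value are all hypothetical; swap in a numeric column from your own data file.

```python
import statistics

# Hypothetical values standing in for one numeric column of your data.
values = [2, 4, 4, 4, 5, 5, 7, 9]


def summarize(numbers):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(numbers),
        "min": min(numbers),
        "max": max(numbers),
        "mean": statistics.mean(numbers),
        "median": statistics.median(numbers),
        "mode": statistics.mode(numbers),
        # pstdev/pvariance treat the list as the whole population;
        # use statistics.stdev/variance if your data is a sample.
        "stdev": statistics.pstdev(numbers),
        "variance": statistics.pvariance(numbers),
    }


def size_label(value, cutoff=5):
    """Branching logic: label a value relative to a hypothetical cutoff."""
    if value < cutoff:
        return "below"
    elif value == cutoff:
        return "at"
    return "above"


summary = summarize(values)
labels = [size_label(v) for v in values]

print(summary)
print(labels)
```

Returning a dictionary keeps each statistic labeled, which makes it easier to present the results under a clear section heading in your notebook.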

Task 2 - Diamonds Dataset

  1. Follow the instructions for Exercise 9.16 (starting pg. 350).
  2. Complete the exercise in a notebook, working through these sections:
     • 1-Load: Get the file, store it in your repo, and load it into a DataFrame.
     • 2-View: Display the first 7 rows and the last 7 rows.
     • 3-Describe: Use the DataFrame describe() function to calculate basic descriptive statistics for all numeric columns.
     • 4-Series: Use the Series method describe() to calculate the descriptive stats for all category/text columns.
     • 4-Unique: Use the Series method unique() to get unique category values.
     • 5-Histograms: Use the DataFrame's hist() function to create a histogram for each numerical column.
  3. Required: Use section headings in your Markdown to make it clear that each of these sections is shown in your notebook. They should be numbered 1-5 and include the keyword shown above.
  4. Required: Include the title of the notebook, and your name and date, at the top.
  5. Do these consistently. A title and section headings are required in every notebook.
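
The five sections above might look roughly like the sketch below. It is an illustration under stated assumptions, not the exercise solution: the inline CSV text is a small stand-in for diamonds.csv, and the column names (`carat`, `cut`, `color`, `price`) are assumptions about the file. In your notebook you would load the real file from your repo instead.

```python
import io

import matplotlib

# Non-interactive backend so this also runs as a plain script.
# Omit this line in a notebook so the plots display inline.
matplotlib.use("Agg")

import pandas as pd

# Illustrative stand-in for diamonds.csv (assumed column names, a few rows only).
csv_text = """carat,cut,color,price
0.23,Ideal,E,326
0.21,Premium,E,326
0.23,Good,E,327
0.29,Premium,I,334
0.31,Good,J,335
0.24,Very Good,J,336
0.24,Very Good,I,336
0.26,Very Good,H,337
"""

# 1-Load: with the real file this would be pd.read_csv("diamonds.csv") (path assumed).
df = pd.read_csv(io.StringIO(csv_text))

# 2-View: first 7 and last 7 rows.
first_rows = df.head(7)
last_rows = df.tail(7)

# 3-Describe: descriptive statistics for the numeric columns.
numeric_stats = df.describe()

# 4-Series: descriptive statistics for one category/text column.
cut_stats = df["cut"].describe()

# 4-Unique: the distinct category values in that column.
cut_values = df["cut"].unique()

# 5-Histograms: one histogram per numeric column.
axes = df.hist()

print(first_rows)
print(numeric_stats)
print(cut_stats)
print(cut_values)
```

In a notebook, each numbered comment would become its own Markdown section heading, with the matching code in the cell below it.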

Task 2 Output

Document your results.

  • execute the completed notebook
  • export to html
  • include the html in your repo

Task 3 - Exploratory Data Project

  1. Use everything you've learned to conduct a unique data exploration project using some information related to your domain.
  2. Tell a story with data (do a web search to learn more).
  3. Use this project to feature all of the key skills learned - creating a professional notebook, writing a good README.md (do a web search).
  4. Include challenging Python programming aspects - find a reason to use filter(), map(), and list comprehensions.
  5. Have fun and make it unique.
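
One possible reason to reach for filter(), map(), and a list comprehension is cleaning a text column before analysis. The sketch below assumes a hypothetical list of raw price strings and a hypothetical `is_number()` helper; your own data and cleaning rules will differ.

```python
# Hypothetical raw values, as they might look after reading a text column.
raw_prices = ["326", " 334 ", "", "n/a", "2757", "18823"]


def is_number(text):
    """Return True when the stripped text consists only of digits."""
    return text.strip().isdigit()


# filter() keeps only the entries that pass the test.
clean_text = list(filter(is_number, raw_prices))

# map() applies a transformation to every remaining entry.
prices = list(map(lambda text: int(text.strip()), clean_text))

# A list comprehension can filter and transform in one expression.
high_prices = [price for price in prices if price > 1000]

print(prices)
print(high_prices)
```

The same cleaning could be written entirely as a list comprehension; using all three here simply shows how each one fits into the pipeline.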

Task 3 Output

Document your results.

  • execute the completed notebook
  • export to html
  • include the html in your repo

Task 4. Push Repo to GitHub

Use VS Code to commit and sync (push) your repo to GitHub - or in Git Bash or terminal, do the following.

git add .
git commit -m "added code"
git push origin main

Optional Task 5. Bonus

  1. As part of your second project, include a new library or module we won't have time to explore.
  2. Consider imageio, nltk, textatistic, textblob, wordcloud, or others.
  3. Basically, look for something that might interest you and see if you can learn it on your own and apply it to your domain/project.

Reflection (on your own)

  1. How comfortable are you starting a project in GitHub, cloning it down, exploring data, and getting it back into GitHub?
  2. What parts are still too challenging to be enjoyable?
  3. Add your suggestions in the discussion forum and we'll see if we can't clear up any issues so you feel ready to complete data analytics projects in Python on your own.

Submission Instructions

  1. Project Submission Instructions

Submit

Part 1 - Project

  1. Paste a clickable link to your public GitHub repo:
  2. Your domain:
  3. About how long did you spend on class this module:
  4. In general, how did it go:
  5. What was the most difficult part:
  6. What was most interesting:
  7. Did you do the optional bonus (y/n). How did it go - or why not?

Part 2 - Self Assessment

From the Module Overview, paste the numbered list of objectives and assess your ability on each as "Highly proficient", "Proficient", or "Not Proficient":

Module Objectives

At the end of this module students will be able to:

  1. Research tools like pandas, matplotlib, and seaborn (L02)
  2. Perform a guided exploratory data analysis project (L02)
  3. Plan and conduct a new data analytics project (L02)
  4. Ingest data (L02)
  5. Explore data (L02)
  6. Calculate descriptive statistics on data (L02)
  7. Visualize data and results (L02)
  8. Apply Python to achieve unique objectives (L02)
  9. Create and manage git repositories (L02)
  10. Employ git clone, git add, git commit, and git push - either through an IDE or at the command line (L02)
  11. Tell a story with data in a unique and compelling way. (L02)
  12. Communicate professionally (L02)

Checklist

  • Get Started
  • Task 1. Begin with the End in Mind
  • Task 2 - Diamonds Dataset
  • Task 3 Output
  • Task 4. Push Repo to GitHub
  • Optional Task 5. Bonus
  • Reflection (on your own)
  • Submission Instructions
  • Submit
  • Part 1 - Project
  • Part 2 - Self Assessment
