Assignment Project 6

In this module, you'll complete a guided exploratory data analysis project, then conduct a second, unique data analysis/exploration project. The goal is to tell a unique and compelling story with data. Your ability to communicate in a professional and engaging manner is as important as your analytical capabilities.

Chapters

This module requires the skills learned in previous chapters. The first, guided exploratory data project focuses on diamonds.csv and is based on Exercise 9.16, beginning on page 352 of the text. The second is a project of your choice, related to your domain.

Get Started

Create a new GitHub repo named datafun-06-projects.
Git clone your new repo into your Documents folder.
Always ensure your repo has the 3 basic files all our repos need:
a good README.md,
.gitignore, and
about.py.
Copy these from previous repos as needed.
Update README.md to reflect the focus of this module.

Task 1. Begin with the End in Mind

Read the exercise. Begin considering what you'd like your second project to focus on / showcase.

Use your second project to show all the Python things you know.

Read from a data file.
Use statistics - mean, median, mode, standard deviation, variance for one or more of the numerical columns.
Use built-in functions min(), max(), len(), count of records, number of columns, others...
Create some custom functions, use some branching logic to transform and/or show only a part of the data.
Get some of your data into a list.
Use filter(), map(), and list comprehensions to clean and transform the data.
Use pandas.
Use matplotlib hist().
Strive to "tell a story" with data. Use a good title section and useful section headings to professionally present your work.
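
As a starting point for the statistics, built-in function, and custom function items above, here is a minimal sketch using only the standard library. The sample values, the function names summarize() and price_band(), and the cutoff of 350 are all hypothetical; substitute a numeric column from your own data file.

```python
import statistics

# Hypothetical sample values standing in for one numeric column of your data.
prices = [300, 320, 320, 350, 400, 410]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),        # sample standard deviation
        "variance": statistics.variance(values),  # sample variance
    }


def price_band(price, cutoff=350):
    """Branching logic: label a value 'low' or 'high' relative to a cutoff."""
    if price < cutoff:
        return "low"
    return "high"


summary = summarize(prices)
bands = [price_band(p) for p in prices]

print(summary)
print(bands)
```

Returning the statistics as a dictionary keeps the notebook cell short and makes it easy to display the same summary for several columns.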

Task 2 - Diamonds Dataset

Follow the instructions for Exercise 9.16 (starting pg. 350).
Complete the exercise in a notebook.
1-Load: Get the file, store it in your repo, and load it into a DataFrame.
2-View: Display the first 7 rows and the last 7 rows.
3-Describe: Use the DataFrame describe() function to calculate basic descriptive statistics for all numeric columns.
4-Series: Use the Series method describe() to calculate the descriptive stats for all category/text columns.
4-Unique: Use the Series method unique() to get unique category values.
5-Histograms: Use the DataFrame's hist() function to create a histogram for each numerical column.
Required: Use section headings in your Markdown to make it clear that each of these sections is shown in your notebook. They should be numbered 1-5 and include the keyword shown above.
Required: Include the title of the notebook, and your name and date at the top.
Do these consistently. A heading and section titles are required in every notebook.
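
The steps above might look roughly like the following sketch. It is not the textbook's solution. To stay self-contained it reads a few hypothetical rows from an in-memory string; the column names (carat, cut, color, price) are assumed, and plot_histograms() is an illustrative helper name. In your notebook, pass the path of the diamonds.csv file stored in your repo to pd.read_csv() instead.

```python
import io

import pandas as pd

# Hypothetical rows in the style of diamonds.csv (assumed column names).
SAMPLE_CSV = """carat,cut,color,price
0.23,Ideal,E,326
0.21,Premium,E,326
0.23,Good,E,327
0.29,Premium,I,334
0.31,Good,J,335
0.24,Very Good,J,336
0.24,Very Good,I,336
0.26,Very Good,H,337
"""

# 1-Load: read the CSV data into a DataFrame.
diamonds = pd.read_csv(io.StringIO(SAMPLE_CSV))

# 2-View: first 7 rows and last 7 rows.
first_rows = diamonds.head(7)
last_rows = diamonds.tail(7)

# 3-Describe: by default, describe() summarizes the numeric columns only.
numeric_stats = diamonds.describe()

# 4-Series: describe() on a text column reports count, unique, top, and freq.
cut_stats = diamonds["cut"].describe()

# 4-Unique: distinct category values, in order of first appearance.
cut_values = diamonds["cut"].unique()


# 5-Histograms: one histogram per numeric column (requires matplotlib).
def plot_histograms(df):
    return df.hist(figsize=(8, 6))


print(first_rows)
print(numeric_stats)
print(cut_stats)
print(cut_values)
```

Calling plot_histograms(diamonds) in a notebook cell draws the histograms; it is wrapped in a function here so the rest of the sketch runs even where matplotlib is not installed.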

Task 2 Output

Document your results.

Execute the completed notebook.
Export to HTML.
Include the HTML in your repo.

Task 3 - Exploratory Data Project

Use everything you've learned to conduct a unique data exploration project using some information related to your domain.
Tell a story with data (do a web search to learn more).
Use this project to feature all of the key skills learned - creating a professional notebook, writing a good README.md (do a web search).
Include challenging Python programming aspects - find a reason to use filter(), map(), and list comprehensions.
Have fun and make it unique.
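
One common reason to reach for filter(), map(), and list comprehensions is cleaning values read from a text file. The sketch below assumes hypothetical raw strings and an illustrative helper named is_number(); adapt the cleaning rule to whatever your own data needs.

```python
# Hypothetical raw values as they might arrive from a text file: strings with
# stray whitespace, a blank entry, and a missing-value marker.
raw_prices = [" 326", "327 ", "", "N/A", "334", " 2757 "]


def is_number(text):
    """Return True if the stripped text is a non-negative whole number."""
    return text.strip().isdigit()


# filter(): keep only the entries that look like numbers.
valid = filter(is_number, raw_prices)

# map(): convert each remaining string to an int
# (int() ignores surrounding whitespace).
prices = list(map(int, valid))

# List comprehension: select and transform in a single expression.
expensive_doubled = [p * 2 for p in prices if p > 330]

print(prices)
print(expensive_doubled)
```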

Task 3 Output

Document your results.

Execute the completed notebook.
Export to HTML.
Include the HTML in your repo.

Task 4. Push Repo to GitHub

Use VS Code to commit and sync (push) your repo to GitHub - or in Git Bash or terminal, do the following.

git add .
git commit -m "added code"
git push origin main

Optional Task 5. Bonus

As part of your second project, include a new library or module we won't have time to explore.
Consider imageio, nltk, textatistic, textblob, wordcloud, or others.
Basically, look for something that might interest you and see if you can learn it on your own and apply it to your domain/project.

Reflection (on your own)

How comfortable are you starting a project in GitHub, cloning it down, exploring data, and getting it back into GitHub?
What parts are still too challenging to be enjoyable?
Add your suggestions in the discussion forum and we'll see if we can't clear up any issues so you feel ready to complete data analytics projects in Python on your own.

Submission Instructions

Project Submission Instructions

Submit

Part 1 - Project

Paste a clickable link to your public GitHub repo:
Your domain:
About how long did you spend on class this module:
In general, how did it go:
What was the most difficult part:
What was most interesting:
Did you do the optional bonus (y/n)? How did it go, or why not?

Part 2 - Self Assessment

From the Module Overview, paste the numbered list of objectives and assess your ability on each as "Highly proficient", "Proficient", or "Not Proficient":

Module Objectives

At the end of this module students will be able to:

Research tools like pandas, matplotlib, and seaborn (L02)
Perform a guided exploratory data analysis project (L02)
Plan and conduct a new data analytics project (L02)
Ingest data (L02)
Explore data (L02)
Calculate descriptive statistics on data (L02)
Visualize data and results (L02)
Apply Python to achieve unique objectives (L02)
Create and manage git repositories (L02)
Employ git clone, git add, git commit, and git push - either through an IDE or at the command line (L02)
Tell a story with data in a unique and compelling way. (L02)
Communicate professionally (L02)
