This project was completed as part of the Udacity Data Analyst Nanodegree program requirements.
Investigate the No Show Appointments dataset using NumPy, pandas, and Matplotlib.
In this project, you will analyze a dataset and then communicate your findings about it. You will use the Python libraries NumPy, pandas, and Matplotlib.
You will need an installation of Python, plus the following libraries:
- pandas
- NumPy
- Matplotlib
- csv (part of the Python standard library, so no separate installation is needed)
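
Before starting, it can help to confirm that the libraries listed above are importable in your environment. The following is a minimal sketch, not part of the original project: the helper name `missing_packages` is hypothetical, and the import names (`pandas`, `numpy`, `matplotlib`, `csv`) are assumed to match the libraries in the list.

```python
import importlib.util

# Import names assumed to correspond to the libraries listed above.
# csv ships with Python itself; the other three are third-party packages.
REQUIRED = ["pandas", "numpy", "matplotlib", "csv"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```

Any name reported as missing can then be installed with your package manager of choice.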
For the final project, you will conduct your own data analysis and create a file to share that documents your findings. You should start by taking a look at your dataset and brainstorming what questions you could answer using it. Then you should use pandas and NumPy to answer the questions you are most interested in, and create a report sharing the answers. You will not be required to use inferential statistics or machine learning to complete this project, but you should make it clear in your communications that your findings are tentative.
For this project, I chose the "No Show Appointments" dataset.
Brainstorm some questions you could answer using the dataset you chose, then start answering those questions. Try to suggest questions that explore relationships between multiple variables. You should aim to analyze at least one dependent variable and three independent variables in your investigation.
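
As an illustration of relating one dependent variable to three independent variables, here is a minimal sketch using pandas. It is not the analysis from this project: the column names (`no_show`, `gender`, `age`, `sms_received`), the age bins, and the helper `no_show_rate_by` are all assumptions, and the tiny data frame is made up purely so the example runs on its own. Any rates it prints describe the toy data only.

```python
import pandas as pd

# Made-up rows for illustration only; these are NOT records from the dataset.
# Column names are assumptions and may differ from the real file.
df = pd.DataFrame({
    "no_show": [0, 1, 0, 1, 0, 0],       # dependent variable (1 = missed)
    "gender": ["F", "F", "M", "M", "F", "M"],
    "age": [25, 62, 8, 41, 33, 70],
    "sms_received": [0, 1, 0, 1, 1, 0],
})

# Bin age into coarse groups so it can be compared like a category.
# The bin edges are an arbitrary choice for this sketch.
df["age_group"] = pd.cut(
    df["age"], bins=[0, 18, 65, 120], labels=["child", "adult", "senior"]
)


def no_show_rate_by(frame, column):
    """Mean of no_show within each group of one independent variable."""
    return frame.groupby(column, observed=True)["no_show"].mean()


# One dependent variable, three independent variables.
for col in ["gender", "sms_received", "age_group"]:
    print(no_show_rate_by(df, col), end="\n\n")
```

Group means like these are descriptive only, so any pattern they suggest should be reported as tentative rather than as a proven relationship.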
Once you have finished analyzing the data, create a report that shares the findings you found most interesting. If you use a Jupyter notebook, share your findings alongside the code you used to perform the analysis. Put your report text in Markdown cells so that your comments and findings are clearly distinguished from your code. Submit your report as an HTML or PDF file so that it can be opened easily.