learn-co-students/dsc-pandas-data-cleaning-recap-nyc-ds-010620

Getting Started with Data Science - Recap

Introduction

In this section, you learned about the Data Science process as well as the fundamentals of Python. You started off with learning about the basic Python data types as well as variable assignment. After that, you worked with collections of these basic data types, learning about lists and dictionaries. Finally, you learned about data visualization and used matplotlib to create some bar charts, histograms, and scatter plots.
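As a quick refresher, the basic data types, variable assignment, lists, and dictionaries mentioned above can be sketched as follows. This is a minimal illustration only: every name and value in it (`city`, `temperatures`, `city_info`, and so on) is made up for the example and does not come from the lessons.

```python
# Basic data types and variable assignment (all values are illustrative).
city = "New York"           # str
population_millions = 8.4   # float
boroughs = 5                # int
is_coastal = True           # bool

# A list is an ordered, mutable collection.
temperatures = [61, 64, 70, 75, 72]
temperatures.append(68)     # add one more reading to the end
average_temp = sum(temperatures) / len(temperatures)

# A dictionary maps keys to values.
city_info = {"name": city, "boroughs": boroughs}
city_info["coastal"] = is_coastal   # add a new key-value pair

print(type(population_millions).__name__)
print(round(average_temp, 2))
print(city_info)
```

Lists are a good fit when order matters and you look items up by position; dictionaries are a good fit when you look values up by a meaningful key.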

Key Takeaways

  • There is a lot to learn about Data Science, but most of the time you're doing one of four things: predicting a continuous value (regression), predicting a category (classification), identifying unusual data (anomaly detection), or generating recommendations.
  • Data Science is not just about selecting and tuning machine learning models. Much of the value comes from understanding the business needs and formulating the problem thoughtfully, and most of the effort goes into the early stages of finding, cleaning, exploring, and simplifying the data so it's ready to be run against your models.
  • It's important to use professional tools. Jupyter Notebook is a great environment for combining your notes and your code. Git allows you to keep track of your changes. GitHub allows you to share them with your team. Conda virtual environments ensure that the libraries you use for one project won't break another project you are working on.
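
To make one of the problem types from the first takeaway concrete, here is a rough sketch of anomaly detection using only the Python standard library. The data (`daily_orders`) is hypothetical, and the rule of flagging values more than 2 standard deviations from the mean is just one simple, illustrative approach, not the method taught in this section.

```python
from statistics import mean, stdev

# Hypothetical daily order counts; the last value is deliberately unusual.
daily_orders = [102, 98, 105, 99, 101, 97, 103, 180]

mu = mean(daily_orders)      # average of the observations
sigma = stdev(daily_orders)  # sample standard deviation

# Flag values more than 2 standard deviations from the mean.
# The threshold of 2 is an illustrative assumption, not a universal rule.
anomalies = [x for x in daily_orders if abs(x - mu) / sigma > 2]

print(anomalies)
```

In practice the threshold and the method itself depend on the data and the business question, which is why formulating the problem carefully matters as much as the code.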
