lee-ngo/dataset-ice-fire

Introduction to Data Science by Lee Ngo

Exploring a Dataset of Ice and Fire

Hello, and welcome to an intentionally fun and light-hearted approach to learning data science for beginners! In this course, you will:

  • Set up your environment like a proper data scientist
  • Learn the basics of the Python language for data science
  • Import popular Python libraries and data files
  • Perform some Exploratory Data Analysis (EDA)
  • Complete some Data Visualization

We will be exploring this dataset created for fun by Chris Albon (github.com/chrisalbon)! Yes, we're going to have a lot of fun with this one.

What You Need To Know First

This course is designed for complete beginners, but knowing the following will help tremendously:

If you have any suggestions on ways to make this class even more accessible, please suggest them!

Setting Up Your Computer

To begin, we need to install some software on your computer that will run the following:

  • Python (we'll be working with Python 3, but Python 2 should be just fine, too)
  • Jupyter Notebooks, a great tool for iterating quickly in Python
  • Various Python libraries (we'll be using NumPy, Pandas, and Matplotlib)

Fortunately, we can do all of the above by installing one thing: Anaconda from Continuum Analytics.

Please follow the link below to set up Anaconda on your computer.

We won't be sharing specific steps because they vary by computer.

https://continuum.io/downloads
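
Once Anaconda is installed, a quick check along these lines can confirm that Python and the libraries mentioned above are available. This is only a sketch: the `check_libraries` helper and the `LIBRARIES` list are names made up for this example, and it only tests whether each library can be found, not which version you have.

```python
import importlib.util
import sys

# Import names of the libraries this course mentions (NumPy, Pandas, Matplotlib).
LIBRARIES = ["numpy", "pandas", "matplotlib"]


def check_libraries(names=LIBRARIES):
    """Return a dict mapping each library name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Show which Python version is running, then report on each library.
    print("Python version:", sys.version.split()[0])
    for name, found in check_libraries().items():
        print(f"{name}: {'installed' if found else 'MISSING'}")
```

If any library is reported as missing, re-running the Anaconda installer is usually the simplest fix.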

Download the GitHub repository for this course

  1. Go to the top-right corner and find the "Clone or Download" button. Click on it.
  2. Click on Download ZIP and save that file somewhere on your computer.
  3. Unzip the file you just downloaded and remember where that directory is.

Initialize Jupyter Notebooks in your terminal

  1. Open up your Command Prompt, Terminal, Git, etc.
  2. Navigate to the folder/directory where you downloaded this GitHub repository.
  3. Type jupyter notebook into the command prompt and wait a moment.

If a webpage opens that says 'Jupyter' at the top, you're ready to move forward!

At this point, you can open up your own Jupyter Notebook or, if you must, use this pre-populated one here.
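
If you aren't sure you navigated to the right folder, a small script like the one below can list the notebook files it finds there. It is a sketch only: `find_notebooks` is a hypothetical helper written for this example, and it assumes notebooks use the standard `.ipynb` extension.

```python
from pathlib import Path


def find_notebooks(directory="."):
    """Return the sorted names of Jupyter Notebook files in a directory."""
    return sorted(p.name for p in Path(directory).glob("*.ipynb"))


if __name__ == "__main__":
    # Look in the current working directory, i.e. wherever you navigated to.
    notebooks = find_notebooks()
    if notebooks:
        print("Notebooks found:", ", ".join(notebooks))
    else:
        print("No notebooks here. Are you in the unzipped repository folder?")
```

Running it from the unzipped repository folder should list any notebook files there; an empty result suggests you are in a different directory.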

Why are we using Python instead of R or some other language? Ah, this old debate. Although others have stronger opinions than I do on the matter, I just happened to learn most of my data science via Python and discovered a great community around it. Generally, I code in JavaScript or Python, and anyone will tell you that it doesn't really matter what you code in as long as you know how to code.

If there's enough interest in converting this lesson into R or another language, perhaps I'll do it... (but probably not).

About this Course's Author

Lee Ngo is a self-described 'Education Technology Community Architect' and is perpetually passionate about inclusivity, engagement, and empathy in spaces of professional advancement. Lee serves as national data science evangelist for Metis. Previously, Lee served as an evangelist for Galvanize, based in Seattle. Before that, he worked for UP Global (now Techstars) and founded his own ed-tech company in Pittsburgh, PA. Lee believes in learning by doing, engaging, and sharing, and he teaches code through a combination of visual communication, teamwork, and project-oriented learning.

You can email him at lee-dot-ngo-at-gmail-dot-com for any further questions.

Disclaimer: This lesson is entirely open-source, unaffiliated with any other entities, and intended for educational and entertainment purposes. The data used remains unchanged from its initial source out of respect for its author and the inspired material. Please feel free to fork, clone, remake, sample, and enjoy as you please under the MIT License.
