
Pandas tutorial on Weather data for San Francisco Bay Area


simongeek/PandasDA


Pandas + Seaborn tutorial on Weather data for San Francisco Bay Area in California

This project analyzes the weather in the San Francisco Bay Area region of California, specifically in the cities of San Francisco, San Mateo, Santa Clara, Mountain View and San Jose. Data cleaning, manipulation and transformation were done with Pandas, a powerful Python data analysis toolkit. Additionally, there are many visualizations, some of them prepared with the matplotlib and seaborn libraries. The project introduces the basics of Pandas.

Why Pandas and Seaborn?

  • You can easily pass Pandas Data Frame to Seaborn
  • Plot data from interesting columns or rows
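As a minimal sketch of the first point, a Data Frame can be handed to Seaborn through the `data=` argument, with columns selected by name. The frame below is made-up illustration data, and the column names `temperature`, `humidity` and `city` are assumptions, not the names used in this project's `weather.csv`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up values for illustration only; not taken from weather.csv.
weather = pd.DataFrame({
    "temperature": [50, 52, 55, 58, 61],
    "humidity": [80, 76, 70, 65, 60],
    "city": ["San Francisco", "San Mateo", "Santa Clara",
             "Mountain View", "San Jose"],
})

fig, ax = plt.subplots()
# Seaborn reads the columns straight from the Data Frame by name.
sns.scatterplot(data=weather, x="temperature", y="humidity", hue="city", ax=ax)
ax.set_xlabel("temperature [F]")
ax.set_ylabel("humidity [%]")
```

In an interactive session, `plt.show()` would display the figure, and `fig.savefig(...)` would write it to disk.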

Sample plots:

[Images: temperature, temphum, histogram]

A short YouTube presentation of the types of visualization generated:

Project description

The following parameters were chosen for further analysis:

  • temperature [F]
  • humidity [%]
  • pressure [inHg]
  • wind speed [MPH]
  • gust speed [MPH]
  • cloud level [0-10]
  • visibility [%]
  • events such as rain, fog, thunderstorm

What will you learn?

You will learn:

  • How to read CSV files into Pandas Data Frame
  • How to clean the data, remove missing values, remove unused columns, replace names etc.
  • How to create plots, histograms and heat maps based on Pandas Data Frame
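The reading and cleaning steps listed above might look roughly like the sketch below. It reads from an in-memory string so it runs on its own. The sample rows, the column names (`Mean TemperatureF`, `Mean Humidity`, `Events`, `Unused`) and the helper `load_and_clean` are hypothetical and do not come from this project's `weather.csv` or `main.py`.

```python
import io
import pandas as pd

# Hypothetical sample standing in for weather.csv; real column names may differ.
RAW_CSV = """Date,Mean TemperatureF,Mean Humidity,Events,Unused
2015-01-01,50,70,Rain,x
2015-01-02,52,,Fog,x
2015-01-03,55,60,,x
2015-01-04,58,55,Rain,x
"""

def load_and_clean(source):
    """Read CSV data, drop an unused column, rename columns, handle missing values."""
    df = pd.read_csv(source, parse_dates=["Date"])
    df = df.drop(columns=["Unused"])
    df = df.rename(columns={
        "Mean TemperatureF": "temperature",
        "Mean Humidity": "humidity",
        "Events": "events",
    })
    # Days without a recorded event get an explicit label instead of NaN.
    df["events"] = df["events"].fillna("No event")
    # Rows missing a numeric measurement are removed.
    df = df.dropna(subset=["temperature", "humidity"])
    return df.reset_index(drop=True)

weather = load_and_clean(io.StringIO(RAW_CSV))

# A correlation matrix like this one is the usual input for a heat map.
corr = weather[["temperature", "humidity"]].corr()
```

With a real file, `load_and_clean("weather.csv")` would take the place of the in-memory string, and `corr` could be passed to `seaborn.heatmap` to draw the heat map.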

Project structure

The project contains two files. The first contains raw CSV data taken from the U.S. Government's open data website. The second is a Python script with all the pandas and seaborn code:

  • weather.csv - data file, generated from U.S. Government's open data website
  • main.py - main file with analysis and plots

Resources

Grab the code or run the project in an online IDE