# Previewing a dataframe  
*Pandas* is a Python module (i.e., a library of functions) for working with tabular data, similar to a spreadsheet. One advantage is that an analysis can be automated and rerun with a single command, whereas a spreadsheet requires repeating the same clicking and typing each time. Python can also handle significantly larger data sets than most spreadsheet software.  

First, you'll need to *import* the pandas module to use its functions. If you forget, the code below will produce errors because Python won't recognize the functions it's trying to access (they weren't imported). 

In [None]:
# you only need to run this once, but it doesn't break anything to run it multiple times
import pandas as pd

Now that pandas is imported, the .read_csv() function can read in a csv file with data. *Technical note*:  
- When pandas reads in a data file, it creates a *dataframe* which is like a data table.  
- You can give a dataframe a generic name, like "data" or "data2", or something more meaningful. This example calls its dataframe *penguins* since it holds data about penguins.  
- Python variable names are case-sensitive, must begin with a letter or an underscore, and cannot contain spaces or special characters (only letters, digits, and underscores are allowed).  
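The naming rules can be checked directly in Python. The sketch below uses the built-in string method `.isidentifier()`; the candidate names are made up for illustration only.

```python
# str.isidentifier() reports whether a string follows Python's naming rules
candidate_names = ["penguins", "Penguins", "_penguins", "penguins2",
                   "2penguins", "my penguins", "penguins!"]

validity = {name: name.isidentifier() for name in candidate_names}
for name, ok in validity.items():
    print(f"{name!r}: {'valid' if ok else 'not valid'}")

# names are case-sensitive, so these are two separate variables
penguins = "lowercase"
Penguins = "capitalized"
print(penguins == Penguins)
```

One caveat: reserved words such as `class` pass this check but still can't be used as variable names.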

For this example notebook, the file is read from the web, but you can also read from a Google Sheet or upload a file from your computer.  
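As a rough sketch of reading data that doesn't come from the web, the example below passes csv-formatted text to `pd.read_csv()` through `io.StringIO`, which behaves like an open file. The values in `csv_text` are made up for illustration, and `my_data.csv` is a hypothetical file name.

```python
import io
import pandas as pd

# made-up stand-in for the contents of a small csv file
csv_text = """species,island,body_mass_g
Adelie,Torgersen,3750
Gentoo,Biscoe,5000
"""

# io.StringIO lets read_csv treat the text like an open file
example = pd.read_csv(io.StringIO(csv_text))
print(example)

# for a file saved in the same folder as the notebook, pass its name instead:
# example = pd.read_csv("my_data.csv")   # hypothetical file name
```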

In [None]:
# this reads in a data file
penguins = pd.read_csv('https://github.com/mwaskom/seaborn-data/raw/master/penguins.csv')

# this shows the first few rows of the dataframe
# you can edit the code to show a different number of rows
# the .tail() function is similar, but shows the last few rows
penguins.head(3)

Want to know how big the dataframe is? For large data sets it may not be convenient to view the entire dataframe. Instead, you can use some functions to get a better idea of what your data look like.  

In [None]:
# the .shape attribute (no parentheses needed) gives the number of (rows, columns) in the dataframe
penguins.shape
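Since `.shape` is a tuple, its two values can be unpacked into separate variables. This is a minimal sketch on a small made-up dataframe (the values are illustrative, not taken from the penguins file).

```python
import pandas as pd

# made-up values, used only to illustrate .shape
example = pd.DataFrame({
    "species": ["Adelie", "Gentoo", "Chinstrap"],
    "body_mass_g": [3750, 5000, 3800],
})

# .shape is a tuple: (rows, columns)
n_rows, n_cols = example.shape
print(f"{n_rows} rows and {n_cols} columns")

# len() gives just the number of rows
print(len(example))
```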

In [None]:
# the .describe() function gives descriptive statistics for each numeric column of the dataframe
penguins.describe()
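By default, `.describe()` skips text columns. The sketch below, again on a small made-up dataframe, shows a few related tools: `.dtypes` for column types, `include="all"` to summarize text columns too, and `.isna().sum()` to count missing values.

```python
import pandas as pd

# made-up values, including one missing measurement
example = pd.DataFrame({
    "species": ["Adelie", "Gentoo", "Gentoo"],
    "body_mass_g": [3750.0, 5000.0, None],
})

# the data type of each column
print(example.dtypes)

# numeric columns only (the default)
numeric_summary = example.describe()
print(numeric_summary)

# every column, including text columns
full_summary = example.describe(include="all")
print(full_summary)

# number of missing values in each column
missing = example.isna().sum()
print(missing)
```

Statistics that don't apply to a column (such as the mean of a text column) show up as `NaN` in the full summary.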

## Credits
This notebook was designed by [Adam LaMee](https://adamlamee.github.io/). The penguin data came from [Allison Horst](https://www.allisonhorst.com/) in R format and was converted to a csv for Seaborn use by [Michael Waskom](https://github.com/mwaskom/seaborn). Thanks to the great folks at [Binder](https://mybinder.org/) and [Google Colaboratory](https://colab.research.google.com/notebooks/intro.ipynb) for making this notebook interactive without you needing to download it or install [Jupyter](https://jupyter.org/) on your own device.