## Pandas Series

### What is a Series?

A Pandas Series is like a column in a table.

It is a one-dimensional array holding data of any type.

In [1]:
# Example

# Create a simple Pandas Series from a list:

import pandas as pd 

a = [1, 7, 2]

myvar = pd.Series(a)

print(myvar)

0    1
1    7
2    2
dtype: int64


### Labels

If nothing else is specified, the values are labeled with their index number: the first value has index 0, the second has index 1, and so on.

This label can be used to access a specified value.
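
As a minimal sketch of label access with the default index, the snippet below reads one value by its integer label. The variable names are illustrative only.

```python
import pandas as pd

# With no index argument, the labels default to 0, 1, 2, ...
myvar = pd.Series([1, 7, 2])

# The label 0 refers to the first value.
first = myvar[0]
print(first)  # prints 1
```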

Create Labels

With the index argument, you can name your own labels.

Example

Create your own labels:


In [2]:

a = [1, 7, 2]

myvar = pd.Series(a, index = ["x", "y", "z"]) # specify the index 

print(myvar)

x    1
y    7
z    2
dtype: int64


When you have created labels, you can access an item by referring to the label.

In [3]:
print(myvar["y"])  # access an item of the Series by its label

7

In [7]:
print(myvar["z"])

2

### Key/Value Objects as Series

You can also use a key/value object, like a dictionary, when creating a Series.

Example

Create a simple Pandas Series from a dictionary:



In [8]:
calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories)

print(myvar)

day1    420
day2    380
day3    390
dtype: int64


Note: The keys of the dictionary become the labels.

To select only some of the items in the dictionary, use the index argument and specify only the items you want to include in the Series.

Example

Create a Series using only data from "day1" and "day2":

In [9]:
calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories, index = ["day1", "day2"])

print(myvar)

day1    420
day2    380
dtype: int64
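
One detail worth knowing when passing the index argument together with a dictionary: a label that is not a key in the dictionary is kept, but its value is missing (NaN). The sketch below assumes a hypothetical label "day4" that does not exist in the data.

```python
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

# "day4" is not a key in the dictionary, so its value is missing (NaN).
myvar = pd.Series(calories, index=["day1", "day4"])

print(myvar)
print(myvar.isna())  # True marks the missing entry
```

Because NaN is a floating-point value, the resulting Series uses a float dtype rather than int64.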


## DataFrames

Data sets in Pandas are usually two-dimensional tables, called DataFrames.

A Series is like a column; a DataFrame is the whole table.

Example

Create a DataFrame from a dictionary of two lists (each list becomes a column):



In [12]:
data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}

myvar = pd.DataFrame(data)

print(myvar)

   calories  duration
0       420        50
1       380        40
2       390        45
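
A DataFrame can also be assembled from Series objects, one per column. The following is a minimal sketch with illustrative variable names, not the only way to do it.

```python
import pandas as pd

# Each Series becomes one column of the table.
calories = pd.Series([420, 380, 390])
duration = pd.Series([50, 40, 45])

# The dictionary keys become the column names.
df_from_series = pd.DataFrame({"calories": calories, "duration": duration})

print(df_from_series)
```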


### What is a DataFrame?

A Pandas DataFrame is a two-dimensional data structure, like a two-dimensional array or a table with rows and columns.

Example




In [14]:
# Create a simple Pandas DataFrame:

data = {
    'calories': [450,550,650,420],
    'duration': [30,50,20,60]
}

df = pd.DataFrame(data)

print(df) 



   calories  duration
0       450        30
1       550        50
2       650        20
3       420        60


### Locate Row

As you can see from the result above, the DataFrame is like a table with rows and columns.

Pandas uses the loc attribute to return one or more specified rows.

Example

Return row 0:

print(df.loc[0])

In [20]:
df.iloc[0]  # iloc selects by integer position; with the default index this is the same row as df.loc[0]

calories    450
duration     30
Name: 0, dtype: int64

Note: This example returns a Pandas Series.
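
To check the note, the sketch below compares the type returned for a single label with the type returned for a list of labels. It rebuilds the same small DataFrame so that it runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "calories": [450, 550, 650, 420],
    "duration": [30, 50, 20, 60],
})

row = df.loc[0]      # a single label returns a Series
table = df.loc[[0]]  # a list of labels returns a DataFrame, here with one row

print(type(row).__name__)    # prints Series
print(type(table).__name__)  # prints DataFrame
```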



In [21]:
# Example

# Return rows 0 and 1:

print(df.loc[[0, 1]])

   calories  duration
0       450        30
1       550        50


In [25]:
df.iloc[[1,2]]

   calories  duration
1       550        50
2       650        20


### Named Indexes

With the index argument, you can name your own indexes.

Example



In [None]:
# Add a list of names to give each row a name:

data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}

In [39]:
df = pd.DataFrame(data, index = ["day1", "day2", "day3"])

print(df)

      calories  duration
day1       420        50
day2       380        40
day3       390        45


In [40]:
df

      calories  duration
day1       420        50
day2       380        40
day3       390        45


In [47]:
data = {
    'Age':[20,30,20],
    'Name':['shree','manju','ravi']
}

data = pd.DataFrame(data,index=['a','b','c'])
data

   Age   Name
a   20  shree
b   30  manju
c   20   ravi


### Locate Named Indexes

In [42]:
# Use the named index in the loc attribute to return the specified row(s).

# Example

# Return "day2":

print(df.loc["day2"])

calories    380
duration     40
Name: day2, dtype: int64


In [50]:
data.loc['a']

Age        20
Name    shree
Name: a, dtype: object
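
Once the index is named, loc and iloc are no longer interchangeable: loc looks up rows by label, while iloc looks them up by integer position. A minimal sketch, rebuilding the day1/day2/day3 DataFrame so that it runs on its own:

```python
import pandas as pd

df = pd.DataFrame(
    {"calories": [420, 380, 390], "duration": [50, 40, 45]},
    index=["day1", "day2", "day3"],
)

by_label = df.loc["day2"]  # select by index label
by_position = df.iloc[1]   # select by integer position (the second row)

# Both expressions pick the same row.
print(by_label.equals(by_position))  # prints True
```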

### Load Files Into a DataFrame

If your data sets are stored in a file, Pandas can load them into a DataFrame.

Example

Load a comma-separated file (CSV file) into a DataFrame:



### Read CSV Files

A simple way to store big data sets is to use CSV files (comma-separated values files).

CSV files contain plain text and are a well-known format that can be read by most tools, including Pandas.

In our examples we will be using a CSV file called 'data.csv'.

In [None]:

df = pd.read_csv('data.csv')

print(df) 
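
If no 'data.csv' file is at hand, the same call can be tried on in-memory text. The sketch below uses io.StringIO as a stand-in for the file; the column names and values are made up for illustration and are not the contents of the real 'data.csv'.

```python
import io

import pandas as pd

# Stand-in for 'data.csv'; these columns and values are invented for the sketch.
csv_text = "calories,duration\n420,50\n380,40\n390,45\n"

# read_csv accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
```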


You will learn more about importing files in the next chapters.