pandas has two core data structures: Series and DataFrame. A Series behaves much like a NumPy array, but it is a pandas object with extra functionality.

The first data type we will cover is the Series. Let's import pandas and explore the Series object.

A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of just a numeric position. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.
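
As a minimal sketch of that difference, the snippet below builds a NumPy array and a Series from the same values. The labels `'x'`, `'y'`, `'z'` and the variable names are made up for this illustration.

```python
import numpy as np
import pandas as pd

# A plain NumPy array is addressed by integer position only.
arr_demo = np.array([10, 20, 30])
first_by_position = arr_demo[0]

# A Series built from the same values can also carry labels.
# The labels below are hypothetical, chosen only for this example.
ser_demo = pd.Series(arr_demo, index=['x', 'y', 'z'])
first_by_label = ser_demo['x']

print(first_by_position, first_by_label)
```

Both lookups return the same value; the Series simply offers the label as a second way to reach it.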


In [12]:
import numpy as np
import pandas as pd

## Different ways of Creating a Series

You can create a Series from: <br> 1. A list <br> 2. A NumPy array <br> 3. A dictionary

In [13]:
# Defining the labels for indexing and creating a list
labels = ['a','b','c']
my_list = [10,20,30]

In [14]:
# Defining an array and a sample dictionary to create series
arr = np.array([10,20,30])
d = {'a':10,'b':20,'c':30}

In [15]:
print("List:",my_list)
print("Numpy Array:",arr)
print("Dictionary:",d)

List: [10, 20, 30]
Numpy Array: [10 20 30]
Dictionary: {'a': 10, 'b': 20, 'c': 30}


<b> Create pandas series using Lists </b>

In [16]:
series1 = pd.Series(data=my_list)
series1

0    10
1    20
2    30
dtype: int64

In [17]:
# Create a Series from a list, using custom labels as the index
pd.Series(data=my_list,index=labels)

a    10
b    20
c    30
dtype: int64

In [18]:
pd.Series(my_list,labels)

a    10
b    20
c    30
dtype: int64

In [19]:
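# Positional arguments are (data, index): swapping them makes the strings the data and the numbers the index
pd.Series(labels,my_list)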

10    a
20    b
30    c
dtype: object

In [20]:
pd.Series(index=labels,data = my_list)

a    10
b    20
c    30
dtype: int64

<b> Creating series with NumPy Arrays </b>

In [21]:
pd.Series(arr)

0    10
1    20
2    30
dtype: int32

In [22]:
pd.Series(arr,labels)

a    10
b    20
c    30
dtype: int32

<b> Creating a series using Dictionary </b>

In [23]:
d

{'a': 10, 'b': 20, 'c': 30}

In [24]:
pd.Series(d)

a    10
b    20
c    30
dtype: int64
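
When a dictionary is combined with an explicit `index`, pandas looks up each requested label among the dictionary keys. The sketch below illustrates this; the key `'z'` is deliberately absent from the dictionary and is only there to show what happens to a missing label.

```python
import pandas as pd

d = {'a': 10, 'b': 20, 'c': 30}

# Request the keys in a different order, plus 'z', which is not in d.
# The result follows the order of `index`; the missing label gets NaN,
# which also turns the dtype into float64.
picked = pd.Series(d, index=['c', 'a', 'z'])
print(picked)
```

This is a convenient way to select or reorder entries at construction time, at the cost of silently introducing NaN for labels that do not exist.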

In [25]:
# Index labels do not have to be unique: here 'a' appears twice
label2 = ['a','a','c']
list2 = [100,200,300]

In [27]:
ff = pd.Series(list2,label2)
ff

a    100
a    200
c    300
dtype: int64

In [29]:
ff["a"]

a    100
a    200
dtype: int64

In [35]:
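# Positional access; plain ff[0] on a string-labelled index is deprecated in recent pandas versions
ff.iloc[0]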

100
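
The two lookups above mix label-based and position-based access. A minimal sketch of the explicit accessors, `.loc` for labels and `.iloc` for positions, using the same data as `ff`:

```python
import pandas as pd

ff = pd.Series([100, 200, 300], index=['a', 'a', 'c'])

# .loc selects by label. A duplicated label returns a Series,
# while a unique label returns a single value.
by_label = ff.loc['a']
single = ff.loc['c']

# .iloc selects by integer position, regardless of the labels.
by_position = ff.iloc[0]

# is_unique tells you up front whether a label lookup can return several rows.
print(ff.index.is_unique)
print(by_label.tolist(), single, by_position)
```

Using `.loc` and `.iloc` makes the intent explicit and avoids ambiguity when the index itself contains integers.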

### Data in a Series

A pandas Series can hold a variety of object types:

In [31]:
list3 = ["Alka",10,900.87,"Study"]

In [68]:
x1 = pd.Series(list3)
x1

0      Alka
1        10
2    900.87
3     Study
dtype: object

In [10]:
pd.Series(data=labels)

0    a
1    b
2    c
dtype: object

In [11]:
# Even functions (less frequently used, but good to know)
pd.Series([sum,print,len])

0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object

In [43]:
list4 = [10,78,[1,4,5,7,8]]

In [44]:
pd.Series(list4)

0                 10
1                 78
2    [1, 4, 5, 7, 8]
dtype: object

In [47]:
# Note: this indexes the original Python list, not the Series
list4[2]

[1, 4, 5, 7, 8]

## Using an Index in a Series

The key to using a Series is understanding its index. pandas uses the index labels (names or numbers) to provide fast lookups.

In [52]:
s1 = pd.Series([10, 20, 30, 40], index=['USA', 'Germany', 'USSR', 'Japan'])

In [53]:
s1

USA        10
Germany    20
USSR       30
Japan      40
dtype: int64

In [54]:
s2 = pd.Series([1, 2, 5, 4], index=['USA', 'Germany', 'Italy', 'Japan'])

In [55]:
s2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

In [56]:
s1['USA']

10

In [57]:
s2["Italy"]

5

In [66]:
s3 = pd.Series([100,200,500,400],index = ['USA', 'USA','Italy', 'Japan'])
s3

USA      100
USA      200
Italy    500
Japan    400
dtype: int64

In [64]:
s2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

In [65]:
s2+s3

Germany      NaN
Italy      505.0
Japan      404.0
USA        101.0
USA        201.0
dtype: float64

In [67]:
s3+s3

USA       200
USA       400
Italy    1000
Japan     800
dtype: int64

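Arithmetic between two Series aligns on index labels, not on position. A label present in only one Series yields NaN, which is why the result is converted to float64: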

In [58]:
s1 + s2

Germany    22.0
Italy       NaN
Japan      44.0
USA        11.0
USSR        NaN
dtype: float64
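
If NaN for unmatched labels is not what you want, one option is the `Series.add` method with `fill_value`, which substitutes a default for a label missing on one side. A small sketch with the same `s1` and `s2`; treating a missing country as 0 is an assumption made here for illustration, not necessarily the right choice for your data.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30, 40], index=['USA', 'Germany', 'USSR', 'Japan'])
s2 = pd.Series([1, 2, 5, 4], index=['USA', 'Germany', 'Italy', 'Japan'])

# Labels found in only one Series are treated as 0 on the other side,
# so 'Italy' and 'USSR' keep their own values instead of becoming NaN.
total = s1.add(s2, fill_value=0)
print(total)
```

The result is still float64, but every label now carries a number.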

In [69]:
x1+x1

0      AlkaAlka
1            20
2       1801.74
3    StudyStudy
dtype: object