### Import required libraries

In [1]:
import numpy as np
import pandas as pd

### Series

Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). The axis labels are collectively referred to as the index. The basic method to create a Series is to call:

s = pd.Series(data, index=index)

Here, data can be many different things:

a Python dict

an ndarray

a scalar value (like 5)
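The scalar case from the list above is easy to overlook, so here is a minimal sketch of it. The variable name `scalar_series` and the three labels are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# A scalar is repeated once for every label in the index,
# so an index must be supplied to tell pandas how many rows to create.
scalar_series = pd.Series(5.0, index=["a", "b", "c"])
print(scalar_series)
```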



### From ndarray
If data is an ndarray, index must be the same length as data. If no index is passed, one will be created having values [0, ..., len(data) - 1].

In [2]:
s1 = pd.Series(data=np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])    # series with data and label
s1

a   -0.824727
b   -0.908839
c    1.776761
d   -1.010076
e    0.701787
dtype: float64

In [3]:
s2 = pd.Series(data=np.random.randn(5))   # series with data and no index
s2

0   -0.968215
1    0.550617
2   -0.219792
3   -0.127911
4    0.096965
dtype: float64

In [4]:
s1.index

Index(['a', 'b', 'c', 'd', 'e'], dtype='object')

### From Dictionary

In [5]:
d = {"name": "viswam", "place": "kadapa", "dept": "ece"}

s3 = pd.Series(d)
s3

name     viswam
place    kadapa
dept        ece
dtype: object

In [6]:
# An explicit index is looked up against the dictionary keys.
# A label with no matching key ("dept1" here) gets NaN as its value.
# NaN (Not a Number) is the standard missing-data marker used in pandas.
s4 = pd.Series(data=d, index=["place", "name", "dept", "dept1"])
s4

place    kadapa
name     viswam
dept        ece
dept1       NaN
dtype: object

In [7]:
s = s1    # the cells below refer to `s`, which was never defined; reuse the labeled Series from In [2]
s

In [None]:
# Get data type
s.dtype

In [None]:
# Positional indexing (use .iloc for positions; s[0] on a label-indexed Series is deprecated)
s.iloc[0]

In [None]:
s[:3]

In [None]:
s.median()

In [None]:
s[s > s.median()]

In [None]:
s[s<s.median()]
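Because the Series above holds random numbers, the effect of the boolean filters is hard to check by eye. The sketch below repeats the same idea on fixed values; the name `demo` and its data are assumptions made for illustration.

```python
import pandas as pd

demo = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])

mask = demo > demo.median()   # element-wise comparison gives a boolean Series
above = demo[mask]            # keeps only the rows where mask is True
below = demo[demo < demo.median()]

print(above)
print(below)
```

The median itself (30 here) is excluded from both results because both comparisons are strict.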

In [None]:
s.iloc[[4, 3, 1]]    # select by position; use .iloc when the index holds labels

In [None]:
np.exp(s)

In [None]:
#  If you need the actual array backing a Series, use Series.array.
s.array

In [None]:
# A Series is ndarray-like; if you need an actual ndarray, use Series.to_numpy().
s.to_numpy()
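To make the difference between the two accessors concrete, the following sketch inspects what each one returns. The name `demo` and its values are illustrative assumptions.

```python
import numpy as np
import pandas as pd

demo = pd.Series([1.5, 2.5, 3.5], index=["a", "b", "c"])

backing = demo.array       # a pandas ExtensionArray wrapping the values
values = demo.to_numpy()   # a plain NumPy ndarray; the labels are dropped

print(type(backing).__name__)
print(type(values).__name__)
```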

### Series is dict-like

A Series is like a fixed-size dict in that you can get and set values by index label:

In [None]:
s

In [None]:
s['a']   # Get value using index label

In [None]:
"e" in s

In [None]:
"f" in s

In [None]:
s["f"]    # If key is not present raises KeyError

In [None]:
s.get("f", np.nan)    # returns the default instead of raising KeyError (np.NaN was removed in NumPy 2.0)
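The dict-like operations above can be collected into one small, deterministic sketch. The Series `demo` and the missing label `"z"` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

demo = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])

demo["c"] = 30.0                   # set by label, like dict assignment
has_a = "a" in demo                # membership tests the index labels, not the values
missing = demo.get("z")            # returns None instead of raising KeyError
fallback = demo.get("z", np.nan)   # or a caller-supplied default

print(has_a, missing, fallback)
```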

In [None]:
s + s

In [None]:
s * 2

In [None]:
np.exp(s)

A key difference between Series and ndarray is that operations between Series automatically align the data based on label. Thus, you can write computations without giving consideration to whether the Series involved have the same labels.

In [None]:
s[1:]

In [None]:
s[:-1]

In [None]:
s[1:] + s[:-1]

The result of an operation between unaligned Series will have the union of the indexes involved. If a label is not found in one Series or the other, the result will be marked as missing NaN. Being able to write code without doing any explicit data alignment grants immense freedom and flexibility in interactive data analysis and research. The integrated data alignment features of the pandas data structures set pandas apart from the majority of related tools for working with labeled data.
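A small example with fixed values shows the union-of-indexes behaviour described above. The names `demo`, `left`, `right`, `total`, and `filled` are illustrative; `Series.add` with `fill_value` is shown as one possible way to avoid the NaN results, not as something the original notebook uses.

```python
import pandas as pd

demo = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

left = demo.iloc[1:]     # labels b..e
right = demo.iloc[:-1]   # labels a..d

total = left + right                     # aligned on labels; index is the union a..e
filled = left.add(right, fill_value=0)   # treat a missing label as 0 instead of NaN

print(total)
print(filled)
```

Labels `a` and `e` appear in only one operand, so they are NaN in `total` but keep their single available value in `filled`.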

### name attribute

In [None]:
s1 = pd.Series(data=[1,2,3,4], name='sample series')
s1

### Rename the series

In [None]:
s2= s1.rename('different')

In [None]:
s1, s2
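It is worth noting that `rename` returns a new Series rather than changing the one it is called on. The sketch below checks this with the hypothetical names `original` and `renamed`.

```python
import pandas as pd

original = pd.Series([1, 2, 3, 4], name="sample series")
renamed = original.rename("different")   # returns a new Series object

print(original.name)
print(renamed.name)
```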

# DataFrames

DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects. It is generally the most commonly used pandas object. Like Series, DataFrame accepts many different kinds of input:

Dict of 1D ndarrays, lists, dicts, or Series

2-D numpy.ndarray

Structured or record ndarray

A Series

Another DataFrame
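The last two input kinds in the list, a Series and another DataFrame, are not demonstrated elsewhere in this notebook, so here is a minimal sketch. The names `ser`, `from_series`, and `from_frame` are assumptions for illustration.

```python
import pandas as pd

ser = pd.Series(range(3), index=list("abc"), name="ser")

from_series = pd.DataFrame(ser)          # one column, named after the Series
from_frame = pd.DataFrame(from_series)   # another DataFrame as input

print(from_series)
```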


### From dict of Series or dicts

In [13]:
d = {
    "one": pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"]),
    "two": pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"]),
}

df1 = pd.DataFrame(d)
df1

   one  two
a  1.0  1.0
b  2.0  2.0
c  3.0  3.0
d  NaN  4.0


In [15]:
df2 = pd.DataFrame(d, index=["d", "b", "a"])
df2

   one  two
d  NaN  4.0
b  2.0  2.0
a  1.0  1.0


In [17]:
df3 = pd.DataFrame(d, index=["d", "b", "a"], columns=["two", "three"])
df3

   two three
d  4.0   NaN
b  2.0   NaN
a  1.0   NaN


### The row and column labels can be accessed using the index and columns attributes

In [24]:
df1.columns, df1.index

(Index(['one', 'two'], dtype='object'),
 Index(['a', 'b', 'c', 'd'], dtype='object'))

### From dict of ndarrays or lists

In [26]:
d = {"one": [1, 2, 3, 4], "two": [5, 6, 7, 8]}
df1= pd.DataFrame(d)
df1

   one  two
0    1    5
1    2    6
2    3    7
3    4    8


In [28]:
df2 = pd.DataFrame(d, index=['a', 'b', 'c', 'd'])    # give custom index
df2

   one  two
a    1    5
b    2    6
c    3    7
d    4    8


### From structured or record array

In [29]:
data = [(1, 2.0, "Hello"), (2, 3.0, "World")]

In [30]:
data

[(1, 2.0, 'Hello'), (2, 3.0, 'World')]

In [32]:
df1 = pd.DataFrame(data)
df1

   0    1      2
0  1  2.0  Hello
1  2  3.0  World
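The cell above passes a plain list of tuples, which is why the columns are numbered 0, 1, 2. With an actual NumPy structured array, the field names become the column labels. The sketch below assumes the field names `A`, `B`, `C` and the dtypes shown; none of these come from the original notebook.

```python
import numpy as np
import pandas as pd

# Structured array: each field has a name and its own dtype.
records = np.array(
    [(1, 2.0, "Hello"), (2, 3.0, "World")],
    dtype=[("A", "i4"), ("B", "f4"), ("C", "U10")],
)

frame = pd.DataFrame(records)   # field names are used as column labels
print(frame)
```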


In [33]:
df2 = pd.DataFrame(np.arange(6))
df2

   0
0  0
1  1
2  2
3  3
4  4
5  5


### From list of dicts

In [38]:
data2 = [{"a": 1, "b": 2}, {"a": 5, "b": 10, "c": 20}]
data2
df1 = pd.DataFrame(data2)
df1

   a   b     c
0  1   2   NaN
1  5  10  20.0


In [39]:
dict([("A", [1, 2, 3]), ("B", [4, 5, 6])])

{'A': [1, 2, 3], 'B': [4, 5, 6]}
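The dict built in the cell above is not turned into a DataFrame in the notebook. One plausible follow-up, offered here as an assumption about the intent, is `DataFrame.from_dict`, whose `orient` parameter controls whether the keys become column labels or row labels. The names `pairs`, `by_column`, and `by_row` are illustrative.

```python
import pandas as pd

pairs = dict([("A", [1, 2, 3]), ("B", [4, 5, 6])])

# Default orient="columns": keys become column labels.
by_column = pd.DataFrame.from_dict(pairs)

# orient="index": keys become row labels; column names are supplied separately.
by_row = pd.DataFrame.from_dict(
    pairs, orient="index", columns=["one", "two", "three"]
)

print(by_column)
print(by_row)
```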

### Column selection, addition, deletion

In [40]:
df1

   a   b     c
0  1   2   NaN
1  5  10  20.0


In [41]:
df1["a"]    # selecting a column

0    1
1    5
Name: a, dtype: int64

In [53]:
df1["d"] = df1["a"] * df1["b"]    # Adding column

In [54]:
df1

   a   b     c   d
0  1   2   NaN   2
1  5  10  20.0  50


In [55]:
df1['flag'] = df1["a"] > df1["b"]
df1

   a   b     c   d   flag
0  1   2   NaN   2  False
1  5  10  20.0  50  False


In [56]:
del df1['flag']        # Removing column
df1

   a   b     c   d
0  1   2   NaN   2
1  5  10  20.0  50


In [57]:
d = df1.pop("d")       # deleting column using pop
print(d)
print(df1)

0     2
1    50
Name: d, dtype: int64
   a   b     c
0  1   2   NaN
1  5  10  20.0
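Both `del` and `pop` modify the DataFrame in place. If the original should stay untouched, `DataFrame.drop` is an alternative that returns a new object. The sketch below uses a hypothetical `frame` to contrast the two behaviours.

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 5], "b": [2, 10], "d": [2, 50]})

trimmed = frame.drop(columns=["d"])   # new DataFrame; frame still has "d"
still_has_d = "d" in frame.columns

popped = frame.pop("d")               # removes "d" from frame in place and returns it

print(trimmed.columns.tolist(), frame.columns.tolist())
```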


In [58]:
df1

   a   b     c
0  1   2   NaN
1  5  10  20.0


In [59]:
df1["foo"] = "bar"    # When inserting a scalar value, it will naturally be propagated to fill the column:

In [60]:
df1

   a   b     c  foo
0  1   2   NaN  bar
1  5  10  20.0  bar


In [64]:
df1["foo"][::]

0    bar
1    bar
Name: foo, dtype: object

In [66]:
# Plain assignment (df[col] = ...) appends a column at the end;
# insert() places it at the position given by its first argument.

df1.insert(1, 'bar', df1['a'])
df1

   a  bar   b     c  foo
0  1    1   2   NaN  bar
1  5    5  10  20.0  bar
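The following sketch contrasts positional insertion with plain assignment and shows what happens when the column label already exists. The DataFrame `frame` and the column name `tail` are assumptions for illustration.

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 5], "b": [2, 10]})

frame.insert(1, "bar", frame["a"])   # loc=1 puts "bar" between "a" and "b"
frame["tail"] = frame["b"] * 2       # plain assignment appends at the end

try:
    frame.insert(0, "bar", 0)        # inserting an existing label is rejected
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(frame.columns.tolist(), duplicate_rejected)
```

Passing `allow_duplicates=True` to `insert` lifts that restriction, although duplicate column labels tend to make later selection ambiguous.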


In [None]:
# insert() requires loc, column, and value; calling it with no arguments raises a TypeError
df1.insert()