# Pandas
## Pandas Series

* A Series is similar to a 1-D NumPy array and normally holds values of a single type (numeric, string, datetime, etc.); when the values are of mixed types, they are stored with the generic `object` dtype. A DataFrame is simply a table in which each column is a pandas Series.

* Creating a Series from a
    * List
    * Tuple
    * Dictionary
    * NumPy array
    * Date range
* Series indexing

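* Pandas is a third-party, open-source Python library used for data analysis; it is not part of the standard library and has to be installed separately. You'll be using Pandas heavily for data manipulation, visualisation, preparing data for machine learning models, etc.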


* Pandas implements a number of powerful data operations familiar to users of both database frameworks and spreadsheet programs.

* There are two main data structures in Pandas: Series and DataFrames. Tabular data is usually stored in DataFrames, so manipulating DataFrames quickly is probably the most important skill for data analysis.

    Source: https://pandas.pydata.org/pandas-docs/stable/overview.html
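
As a small illustration of how the two data structures relate, the sketch below builds a DataFrame from two Series and checks that selecting a column gives back a Series. The names `ages`, `cities` and `df`, and the values in them, are made up for this example.

```python
import pandas as pd

# Two Series that share the same index labels (hypothetical example data)
ages = pd.Series([25, 32, 41], index=["a", "b", "c"])
cities = pd.Series(["Pune", "Delhi", "Chennai"], index=["a", "b", "c"])

# A DataFrame can be assembled from Series; each one becomes a column
df = pd.DataFrame({"age": ages, "city": cities})

# Selecting a single column returns a Series again
age_column = df["age"]
print(type(age_column))
print(df.shape)
```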


In [1]:
import pandas as pd

In [2]:
# Creating a pandas Series from a list
# (tuple, dict and date range examples follow in later cells)
li = [34,465,657,23,23,567,678]
s1 = pd.Series(li)
s1

0     34
1    465
2    657
3     23
4     23
5    567
6    678
dtype: int64

In [3]:
import numpy as np
n1 = np.array(li)
n1

array([ 34, 465, 657,  23,  23, 567, 678])

In [4]:
s2 = pd.Series(n1)
s2

0     34
1    465
2    657
3     23
4     23
5    567
6    678
dtype: int32
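
The `int32` above is inherited from the NumPy array, whose default integer size depends on the platform and NumPy version, whereas the Series built from the plain list came out as `int64`. If a consistent dtype matters, it can be requested explicitly. A minimal sketch, reusing the values of `li`; the names `s_from_array` and `s_small` are only for illustration:

```python
import numpy as np
import pandas as pd

li = [34, 465, 657, 23, 23, 567, 678]

# Ask for a specific dtype instead of relying on the platform default
s_from_array = pd.Series(np.array(li), dtype="int64")
s_small = pd.Series(li, dtype="int32")

print(s_from_array.dtype)
print(s_small.dtype)
```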

In [5]:
n1[0]

34

In [11]:
s1.index = list("abcdefg")
s1

a     34
b    465
c    657
d     23
e     23
f    567
g    678
dtype: int64
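
Instead of assigning `index` afterwards, the labels can also be passed when the Series is created. A short sketch using the same values; `s_labeled` is a name chosen only for this example:

```python
import pandas as pd

li = [34, 465, 657, 23, 23, 567, 678]

# Pass the labels through the index argument at construction time
s_labeled = pd.Series(li, index=list("abcdefg"))

print(s_labeled.index.tolist())
print(s_labeled["c"])
```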

In [13]:
# using a tuple; the values have mixed types and the custom index mixes label types too
s2 = pd.Series((2, 3, 1, 34, "sdc", 567.5), index=["x", "y", "z", 3, 4, 90.99])
s2

x            2
y            3
z            1
3           34
4          sdc
90.99    567.5
dtype: object

In [14]:
s2.shape

(6,)

In [15]:
s2.size

6

In [16]:
s2.dtype  # mixed value types are stored with the generic object dtype ('O')

dtype('O')

In [17]:
s2.ndim

1

# Slicing and Indexing
- Accessing a single element
- Accessing more than one element

In [20]:
s2['x']

2

In [22]:
s2[4]  # 4 is matched as an index label here, not as a position

'sdc'

In [23]:
s2["x":4]

TypeError: cannot do slice indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [4] of <class 'int'>
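
The error above comes from mixing a string label and an integer in one slice. One way to avoid this kind of ambiguity is to state the intent with `.loc` (labels) or `.iloc` (positions). A small sketch with a made-up Series `s` whose integer labels deliberately differ from the positions:

```python
import pandas as pd

# Hypothetical Series: integer labels that do not match the positions
s = pd.Series([10, 20, 30], index=[2, 0, 1])

by_label = s.loc[0]       # the element whose label is 0
by_position = s.iloc[0]   # the element at position 0
first_two = s.iloc[0:2]   # positions 0 and 1

print(by_label, by_position)
print(first_two.tolist())
```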

In [24]:
s1

a     34
b    465
c    657
d     23
e     23
f    567
g    678
dtype: int64

In [26]:
s1["a":"d"]  # label-based (explicit) slicing: the end label "d" is included

a     34
b    465
c    657
d     23
dtype: int64

In [27]:
s1[0:4]  # position-based (implicit) slicing: positions 0 to 3, the end position 4 is excluded

a     34
b    465
c    657
d     23
dtype: int64
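
The two slices above return the same four elements because a label slice includes its end label, while a positional slice stops one short of its end position. The sketch below makes that explicit with `.loc` and `.iloc`; it rebuilds `s1` so that it runs on its own.

```python
import pandas as pd

s1 = pd.Series([34, 465, 657, 23, 23, 567, 678], index=list("abcdefg"))

by_label = s1.loc["a":"d"]   # end label "d" is included
by_position = s1.iloc[0:4]   # end position 4 is excluded

print(by_label.equals(by_position))

# Stopping at position 3 leaves "d" out
up_to_c = s1.iloc[0:3]
print(up_to_c.index.tolist())
```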

In [28]:
s2

x            2
y            3
z            1
3           34
4          sdc
90.99    567.5
dtype: object

In [29]:
s2[["y", 3, "x", 4]]  # fancy indexing: select several labels at once, in the given order

y      3
3     34
x      2
4    sdc
dtype: object

In [31]:
# creating a pandas series using a dictionary
di = {
    "Name" : "Supriya",
    "Surname" : "Kiladi",
    "Location" : "AP",
    "PIN" : 123
}
s4 = pd.Series(di)
s4

Name        Supriya
Surname      Kiladi
Location         AP
PIN             123
dtype: object

In [32]:
s4["Name"]

'Supriya'

In [33]:
s4["Location"]

'AP'
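
When a Series is built from a dictionary, the keys become the index. Passing an `index` argument as well selects and orders the keys, and a label that is not in the dictionary gets a missing value. A minimal sketch; the label `"Country"` is a hypothetical key added only to show the missing-value case:

```python
import pandas as pd

di = {
    "Name": "Supriya",
    "Surname": "Kiladi",
    "Location": "AP",
    "PIN": 123,
}

# Only the listed labels are kept; "Country" is not a key in di
subset = pd.Series(di, index=["Name", "PIN", "Country"])

print(subset.index.tolist())
print(subset.isna().tolist())
```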

In [34]:
# pd.date_range returns a DatetimeIndex, not a Series; wrap it in pd.Series(...) to get a Series of dates
s5 = pd.date_range(start = "2021-09-20", end = "2021-10-10")
s5

DatetimeIndex(['2021-09-20', '2021-09-21', '2021-09-22', '2021-09-23',
               '2021-09-24', '2021-09-25', '2021-09-26', '2021-09-27',
               '2021-09-28', '2021-09-29', '2021-09-30', '2021-10-01',
               '2021-10-02', '2021-10-03', '2021-10-04', '2021-10-05',
               '2021-10-06', '2021-10-07', '2021-10-08', '2021-10-09',
               '2021-10-10'],
              dtype='datetime64[ns]', freq='D')

In [36]:
type(s5)

pandas.core.indexes.datetimes.DatetimeIndex
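
Since `pd.date_range` gives a `DatetimeIndex`, an actual Series needs one more step. The sketch below shows two common options: using the dates as the values, or using them as the index. The names `s_dates` and `s_by_date` are chosen only for this example.

```python
import pandas as pd

dates = pd.date_range(start="2021-09-20", end="2021-10-10")

# Option 1: the dates are the values of the Series
s_dates = pd.Series(dates)

# Option 2: the dates are the index; here the values are just a running counter
s_by_date = pd.Series(range(len(dates)), index=dates)

print(type(s_dates))
print(len(s_dates))
print(s_by_date.loc[pd.Timestamp("2021-10-01")])
```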

In [38]:
type(s4)

pandas.core.series.Series

In [39]:
s4.dtype

dtype('O')

In [40]:
s4

Name        Supriya
Surname      Kiladi
Location         AP
PIN             123
dtype: object

In [42]:
s6 = pd.Series([1,2,3,4,5])
s7 = pd.Series([4,5,6,7,8])
s6+s7

0     5
1     7
2     9
3    11
4    13
dtype: int64
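
The addition above works element by element because both Series share the index 0 to 4. Arithmetic between Series aligns on index labels, not on positions, so labels present in only one operand produce `NaN`. A small sketch with two made-up Series `a` and `b`:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])

# Only "y" and "z" exist in both; "x" and "w" become NaN
total = a + b
print(total)

# add() with fill_value treats a missing label as 0 instead
filled = a.add(b, fill_value=0)
print(filled)
```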

In [44]:
min(s6)

1

In [45]:
max(s6)

5
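
Python's built-in `min` and `max` work on a Series, but a Series also has its own methods for this, plus `idxmin` and `idxmax` to find the index label where the extreme value sits. A short sketch that rebuilds `s6`:

```python
import pandas as pd

s6 = pd.Series([1, 2, 3, 4, 5])

smallest = s6.min()
largest = s6.max()

# Index label at which the largest value is found
label_of_largest = s6.idxmax()

print(smallest, largest, label_of_largest)
```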

In [48]:
s6.mean()

3.0

In [49]:
s6.var()

2.5

In [50]:
s6.std()

1.5811388300841898
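
The value 2.5 is the sample variance: `var()` and `std()` in pandas divide by n - 1 by default (`ddof=1`), whereas NumPy's `np.var` divides by n by default. The sketch below compares the two for the same data:

```python
import numpy as np
import pandas as pd

s6 = pd.Series([1, 2, 3, 4, 5])

sample_var = s6.var()             # divides by n - 1 (ddof=1)
population_var = s6.var(ddof=0)   # divides by n
numpy_var = np.var(s6.to_numpy()) # NumPy default is ddof=0

print(sample_var, population_var, numpy_var)
```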

In [56]:
c_s = pd.concat([s6,s7])
c_s.index = np.arange(1,11)
c_s

1     1
2     2
3     3
4     4
5     5
6     4
7     5
8     6
9     7
10    8
dtype: int64
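
Without the manual index assignment, `pd.concat` keeps the original labels, so the result contains each of 0 to 4 twice. If a fresh 0-based index is enough, `ignore_index=True` is an alternative to setting the index by hand. A minimal sketch; `c_default` and `c_reset` are names used only here:

```python
import pandas as pd

s6 = pd.Series([1, 2, 3, 4, 5])
s7 = pd.Series([4, 5, 6, 7, 8])

# Original labels are kept, so they repeat
c_default = pd.concat([s6, s7])
print(c_default.index.tolist())

# A new index 0..9 is generated instead
c_reset = pd.concat([s6, s7], ignore_index=True)
print(c_reset.index.tolist())
```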

# Task
#### Create a pandas Series whose index values run from 1 to 20 and whose values are the squares of the index values

Expected output:
1 1
2 4
3 9
4 16
5 25
.
.
.
.
20 400
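
One possible way to approach the task, offered as a sketch and not as the only solution: build the index with `np.arange` and square it to get the values. The names `idx` and `squares` are chosen for this example.

```python
import numpy as np
import pandas as pd

# Index labels 1..20 (the end value 21 is excluded by arange)
idx = np.arange(1, 21)

# Values are the squares of the index labels
squares = pd.Series(idx ** 2, index=idx)

print(squares.head())
print(squares.tail())
```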