#### Pandas Data Structures

Data structures in Pandas are designed to handle data efficiently. They organize, store, and modify data in a way that optimizes memory usage and computational performance. The Pandas library provides two primary data structures for handling and analyzing data:

1. Series
2. DataFrame

In [1]:
import pandas as pd
print("pandas imported successfully")

pandas imported successfully


### Series

A Series in Pandas is a one-dimensional labeled array capable of holding data of any type, including integers, floats, strings, and Python objects. It consists of two main components:

Data: The actual values stored in the Series.
Index: The labels or indices that correspond to each data value.
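
As a minimal sketch of these two components, the example below builds a small Series and reads its data and its index separately. The variable names and the temperature values are made up for illustration.

```python
import pandas as pd

# Hypothetical temperatures keyed by weekday label
temps = pd.Series([21.5, 23.0, 19.8], index=["mon", "tue", "wed"])

values = temps.to_numpy()      # the data component as a NumPy array
labels = list(temps.index)     # the index component as a plain list
tuesday = temps["tue"]         # look up one value by its label

print(values)
print(labels)
print(tuesday)
```

Each label in the index points to exactly one position in the data, which is why a value can be looked up by label as well as by position.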

A Series object can be created from various inputs, such as:
1. List.
2. ndarray.
3. Dict.
4. Scalar value or constant.

In [11]:
# Create a Series from a list
lst = pd.Series([10, 20, 30, 40, 50])
print(lst)

0    10
1    20
2    30
3    40
4    50
dtype: int64


In [None]:
# Create a Series from Python Dictionary
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data,index=['b','c','x','a'])
print(s)

b    1.0
c    2.0
x    NaN
a    0.0
dtype: float64
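
The two remaining input types from the list above, an ndarray and a scalar value, can be sketched in the same way. The names `arr`, `from_array`, and `from_scalar` are illustrative only.

```python
import numpy as np
import pandas as pd

# From a NumPy ndarray: a default integer index (0, 1, 2, ...) is assigned
arr = np.array([1.5, 2.5, 3.5])
from_array = pd.Series(arr)

# From a scalar: the value is repeated once for every label in the index
from_scalar = pd.Series(5, index=["x", "y", "z"])

print(from_array)
print(from_scalar)
```

When a scalar is used, the index determines how many elements the Series has.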


Attributes of a Series object

In [15]:
print("Data type",s.dtype)
print("Dimension",s.ndim)
print("Bytes",s.nbytes)
print("Shape",s.shape)
print("Size",s.size)
print("Values",s.values)
print("Valuecount",s.value_counts())

Data type float64
Dimension 1
Bytes 32
Shape (4,)
Size 4
Values [ 1.  2. nan  0.]
Valuecount 1.0    1
2.0    1
0.0    1
Name: count, dtype: int64


#### Converting Series to Other Objects
The following methods are commonly used to convert a Series into other formats:

| Method | Description |
|---|---|
| to_list() | Converts the Series into a Python list. |
| to_numpy() | Converts the Series into a NumPy array. |
| to_dict() | Converts the Series into a dictionary. |
| to_frame() | Converts the Series into a DataFrame. |
| to_string() | Converts the Series into a string representation for display. |

In [None]:
# Create a Pandas Series
s = pd.Series([1, 2, 3])
# Convert Series to a Python list
result = s.to_list()
print("Output:",result)
print("Output Type:", type(result))

Output: [1, 2, 3]
Output Type: <class 'list'>


In [3]:
s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# Convert Series to a Python dictionary
result = s.to_dict()

print("Output:",result)
print("Output Type:", type(result))

Output: {'a': 1, 'b': 2, 'c': 3}
Output Type: <class 'dict'>
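
The other conversion methods in the table work along the same lines. The sketch below is one possible illustration; the column name `"value"` passed to `to_frame()` is an arbitrary choice.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

as_array = s.to_numpy()               # NumPy array of the values; the index is dropped
as_frame = s.to_frame(name="value")   # one-column DataFrame; "value" is an illustrative name
as_text = s.to_string()               # plain-text rendering, one line per element

print(type(as_array))
print(as_frame)
print(as_text)
```

Note that `to_frame()` keeps the Series index as the row index of the resulting DataFrame.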


### DataFrame
A DataFrame in the Pandas library is a two-dimensional labeled data structure used for data manipulation and analysis. Different columns can hold different data types, such as integers, floats, and strings. Each column has a label, and each row has an index label, which is used to access specific rows.
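
As a rough illustration of this description, the sketch below builds a small DataFrame with columns of different types and selects data by row and column label. The records and the name `people` are made up for the example.

```python
import pandas as pd

# Hypothetical records: each column holds one data type
people = pd.DataFrame(
    {"name": ["Ann", "Ben"], "age": [31, 27], "score": [4.5, 3.8]},
    index=["p1", "p2"],
)

print(people.shape)        # (number of rows, number of columns)
print(people.dtypes)       # per-column data types
print(people.loc["p2"])    # one row selected by its index label
print(people["age"])       # one column selected by its label
```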

In [17]:
df = pd.DataFrame()
print(df)

Empty DataFrame
Columns: []
Index: []


1. Create a DataFrame from Lists


In [None]:
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)

   0
0  1
1  2
2  3
3  4
4  5


In [22]:
data1 = [['Alex',10],['Bob',12],['Clarke',13]]
df1 = pd.DataFrame(data1,columns=['Name','Age'])
print(df1)

     Name  Age
0    Alex   10
1     Bob   12
2  Clarke   13


2. Create a DataFrame from Dict of ndarrays / Lists

In [23]:
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print(df)

    Name  Age
0    Tom   28
1   Jack   34
2  Steve   29
3  Ricky   42


In [25]:
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data, index=['rank1','rank2','rank3','rank4'])
print(df)

        Name  Age
rank1    Tom   28
rank2   Jack   34
rank3  Steve   29
rank4  Ricky   42


3. Create a DataFrame from List of Dicts

In [26]:
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)

   a   b     c
0  1   2   NaN
1  5  10  20.0


In [27]:
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]

# Two columns whose names match the dictionary keys
df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])

# Two columns, one of which ('b1') matches no dictionary key, so it is filled with NaN
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
print(df1)
print(df2)

        a   b
first   1   2
second  5  10
        a  b1
first   1 NaN
second  5 NaN


4. Create a DataFrame from Dict of Series

In [28]:
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
   'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df)

   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4
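
The output above reflects index alignment: the row labels are the union of both Series indexes, and labels missing from one Series are filled with NaN. A short sketch that checks this, using the same data under the illustrative names `one` and `two`:

```python
import pandas as pd

one = pd.Series([1, 2, 3], index=["a", "b", "c"])
two = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])
df = pd.DataFrame({"one": one, "two": two})

row_labels = list(df.index)               # union of both Series indexes
missing_in_one = df["one"].isna().sum()   # number of labels absent from `one`

print(row_labels)
print(missing_in_one)
print(df.dtypes)   # `one` becomes float64 because NaN cannot be stored in an int64 column
```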


Accessing the DataFrame Row Labels

In [29]:
# Create a DataFrame
df = pd.DataFrame({
    'Name': ['Steve', 'Lia', 'Vin', 'Katie'],
    'Age': [32, 28, 45, 38],
    'Gender': ['Male', 'Female', 'Male', 'Female'],
    'Rating': [3.45, 4.6, 3.9, 2.78]},
    index=['r1', 'r2', 'r3', 'r4'])

# Access the row labels of the DataFrame
result = df.index
print('Output Accessed Row Labels:', result)

Output Accessed Row Labels: Index(['r1', 'r2', 'r3', 'r4'], dtype='object')
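
Row labels are mainly useful for selecting data. The sketch below, which uses a reduced version of the DataFrame above, shows the column labels through `df.columns` and label-based selection through `df.loc`.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Steve", "Lia", "Vin", "Katie"], "Age": [32, 28, 45, 38]},
    index=["r1", "r2", "r3", "r4"],
)

column_labels = list(df.columns)   # column labels, the counterpart of df.index
row_r3 = df.loc["r3"]              # select a row by its label
age_r3 = df.loc["r3", "Age"]       # select a single cell by row and column label

print(column_labels)
print(row_r3)
print(age_r3)
```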
