# Pandas Introduction
**The Pandas library is used for data manipulation and analysis.**

## Setup

### Installing Pandas

In [1]:
! pip3 install pandas



### Importing Pandas

In [2]:
import pandas as pd
pd.__version__

'2.1.4'

In [3]:
# importing numpy
import numpy as np
np.__version__

'1.26.4'

## Data Structures

### Series
A Series in Pandas is a one-dimensional labeled array. It resembles a Python list, but each element carries an index label, so values can be looked up by label as well as by position.

**Creating a Series from a NumPy ndarray**

In [4]:
s = pd.Series(np.random.randint(0, 100, 5))
s

0    40
1    11
2    57
3    45
4    69
dtype: int64
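
The values above come from a random draw, so they will differ on every run. If reproducible output is wanted, one option is a seeded NumPy generator. The following is a minimal sketch; the seed value `42` and the names `rng`, `s_seeded`, and `s_again` are arbitrary choices for illustration and do not appear in the original notebook.

```python
import numpy as np
import pandas as pd

# Assumption: the seed 42 is arbitrary and used only for illustration.
rng = np.random.default_rng(42)

# integers() excludes the upper bound by default, like np.random.randint.
s_seeded = pd.Series(rng.integers(0, 100, 5))

# A fresh generator created with the same seed reproduces the same values.
s_again = pd.Series(np.random.default_rng(42).integers(0, 100, 5))

print(s_seeded.equals(s_again))  # True
```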

**Creating a Series from a list**

In [5]:
s = pd.Series([10, 20, 30, 40, 50])
s

0    10
1    20
2    30
3    40
4    50
dtype: int64

**Creating a Series from a dictionary**

In [6]:
s = pd.Series({'a': 10, 'b': 20, 'c': 30})
s

a    10
b    20
c    30
dtype: int64

**Creating a Series with custom index**

In [7]:
s = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])
s

a    10
b    20
c    30
d    40
e    50
dtype: int64

**Naming a Series**

In [8]:
ages = [30, 50, 23]
names = ['x', 'y', 'z']
s = pd.Series(ages, index=names, name='Age')
s

x    30
y    50
z    23
Name: Age, dtype: int64
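
The index labels are what set a Series apart from a plain list. As a small sketch reusing the `Age` Series from the cell above, elements can be read by label, by position, or with a boolean condition:

```python
import pandas as pd

# Same data as the named Series created above.
s = pd.Series([30, 50, 23], index=['x', 'y', 'z'], name='Age')

by_label = s.loc['y']      # lookup by index label
by_position = s.iloc[0]    # lookup by integer position
over_25 = s[s > 25]        # boolean mask keeps the matching rows and their labels

print(by_label)            # 50
print(by_position)         # 30
print(list(over_25.index)) # ['x', 'y']
print(s.name)              # Age
```

`.loc` and `.iloc` make the intent explicit: the first always means "by label", the second always means "by position".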

### DataFrame
A Pandas DataFrame is a two-dimensional data structure with labeled axes: an index for the rows and column names for the columns.

**Creating a DataFrame from a NumPy array**

In [9]:
df = pd.DataFrame(np.random.rand(4, 3))
df

|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0.26693 | 0.028483 | 0.68488 |
| 1 | 0.630467 | 0.802194 | 0.574913 |
| 2 | 0.827626 | 0.310717 | 0.795573 |
| 3 | 0.271426 | 0.433135 | 0.547501 |


In [10]:
df = pd.DataFrame(np.random.randint(0, 100, (5, 4)), columns=['A', 'B', 'C', 'D'])
df

|   | A | B | C | D |
|---|---|---|---|---|
| 0 | 23 | 84 | 35 | 6 |
| 1 | 60 | 29 | 72 | 9 |
| 2 | 80 | 26 | 82 | 78 |
| 3 | 97 | 12 | 36 | 18 |
| 4 | 10 | 88 | 50 | 84 |


**Creating a DataFrame from a list of lists or a dictionary of lists**

In [11]:
data = [[1, 2, 3, 4], [5, 6, 7, 8]]
df = pd.DataFrame(data, columns=['a', 'b', 'c', 'd'])
df

|   | a | b | c | d |
|---|---|---|---|---|
| 0 | 1 | 2 | 3 | 4 |
| 1 | 5 | 6 | 7 | 8 |


In [12]:
data = {'A': [1, 2, 3, 4],
        'B': [5, 6, 7, 8]}
df = pd.DataFrame(data)
df

|   | A | B |
|---|---|---|
| 0 | 1 | 5 |
| 1 | 2 | 6 |
| 2 | 3 | 7 |
| 3 | 4 | 8 |


**Creating a DataFrame from Series**

In [13]:
company = pd.Series(['Apple', 'Microsoft', 'Google'], name='Company')
revenue = pd.Series([274.5, 143.0, 182.5], name='Revenue')
profit = pd.Series([57.4, 44.3, 40.3], name='Profit')

# Use each Series' name as its column label. This is equivalent to writing
# the keys by hand: {'Company': company, 'Revenue': revenue, 'Profit': profit}
df = pd.DataFrame({
        company.name: company,
        revenue.name: revenue,
        profit.name: profit
})
df

|   | Company | Revenue | Profit |
|---|---|---|---|
| 0 | Apple | 274.5 | 57.4 |
| 1 | Microsoft | 143.0 | 44.3 |
| 2 | Google | 182.5 | 40.3 |
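
To make the "labeled axes" idea concrete, here is a short sketch that inspects both axes of a DataFrame and pulls one column back out as a Series. It rebuilds the company table from plain lists so that it runs on its own; the variable names `revenue_col` and `second_company` are illustrative only.

```python
import pandas as pd

# Same values as the company table above, built from plain lists.
df = pd.DataFrame({
    'Company': ['Apple', 'Microsoft', 'Google'],
    'Revenue': [274.5, 143.0, 182.5],
    'Profit': [57.4, 44.3, 40.3],
})

print(df.shape)          # (3, 3): 3 rows, 3 columns
print(list(df.index))    # [0, 1, 2] (row labels)
print(list(df.columns))  # ['Company', 'Revenue', 'Profit'] (column labels)

# Selecting a single column returns a Series named after that column.
revenue_col = df['Revenue']
print(type(revenue_col).__name__)  # Series
print(revenue_col.name)            # Revenue

# .loc addresses a single cell by row label and column label.
second_company = df.loc[1, 'Company']
print(second_company)              # Microsoft
```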
