# **Python Pandas - Introduction**

Pandas is an open-source Python library that provides high-performance data manipulation and analysis tools built on its powerful data structures.

Pandas deals with the following three data structures:


**Series**

**DataFrame**

**Panel** 


<table class="table table-bordered">
<tr>
<th style="text-align:center;">Data Structure</th>
<th style="text-align:center;">Dimensions</th>
<th style="text-align:center;">Description</th>
</tr>
<tr>
<td style="text-align:center;">Series</td>
<td style="text-align:center;">1</td>
<td style="text-align:center;">1D labeled homogeneous array, size-immutable.</td>
</tr>
<tr>
<td style="text-align:center;">DataFrame</td>
<td style="text-align:center;">2</td>
<td style="text-align:center;">General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.</td>
</tr>
<tr>
<td style="text-align:center;">Panel</td>
<td style="text-align:center;">3</td>
<td style="text-align:center;">General 3D labeled, size-mutable array.</td>
</tr>
</table>

<h3>Mutability</h3>
<p>All Pandas data structures are value-mutable (their contents can be changed), and all except Series are size-mutable. Series is size-immutable.</p>
<p><b>Note</b> &minus; DataFrame is the most widely used of these data structures. Panel is used much less; it was deprecated in pandas 0.20 and removed in pandas 0.25.</p>
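
<p>The two kinds of mutability can be illustrated with a minimal sketch. The values and column names below are illustrative assumptions, not part of any fixed API.</p>

```python
import pandas as pd

# Value mutability: an existing element is overwritten in place.
s = pd.Series([10, 23, 56])
s.iloc[0] = 99

# Size mutability of a DataFrame: a column is added after creation.
df = pd.DataFrame({'Age': [32, 28]})
df['Rating'] = [3.45, 4.6]

print(s.tolist())
print(df.shape)
```

<p>Here the Series keeps its length while one value changes, and the DataFrame grows from one column to two.</p>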
<h2>Series</h2>
<p>Series is a one-dimensional, array-like structure with homogeneous data. For example, the following series is a collection of integers 10, 23, 56, …</p>
<table class="table table-bordered">
<tr>
<td style="text-align:center;">10</td>
<td style="text-align:center;">23</td>
<td style="text-align:center;">56</td>
<td style="text-align:center;">17</td>
<td style="text-align:center;">52</td>
<td style="text-align:center;">61</td>
<td style="text-align:center;">73</td>
<td style="text-align:center;">90</td>
<td style="text-align:center;">26</td>
<td style="text-align:center;">72</td>
</tr>
</table>
<h3>Key Points</h3>
<ul class="list">
<li>Homogeneous data</li>
<li>Size immutable</li>
<li>Values are mutable</li>
</ul>
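
<p>As a rough sketch, the series shown in the table above could be built from a plain Python list. The variable name <code>s</code> is an arbitrary choice.</p>

```python
import pandas as pd

# Build a Series from the ten integers in the table above.
s = pd.Series([10, 23, 56, 17, 52, 61, 73, 90, 26, 72])

# All elements share a single dtype (homogeneous data).
print(s.dtype)

# Values can be changed without changing the length.
s.iloc[1] = 24
print(len(s))
```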
<h2>DataFrame</h2>
<p>DataFrame is a two-dimensional, tabular structure whose columns can hold different data types. For example,</p>
<table class="table table-bordered">
<tr>
<th style="text-align:center;">Name</th>
<th style="text-align:center;">Age</th>
<th style="text-align:center;">Gender</th>
<th style="text-align:center;">Rating</th>
</tr>
<tr>
<td style="text-align:center;">Steve</td>
<td style="text-align:center;">32</td>
<td style="text-align:center;">Male</td>
<td style="text-align:center;">3.45</td>
</tr>
<tr>
<td style="text-align:center;">Lia</td>
<td style="text-align:center;">28</td>
<td style="text-align:center;">Female</td>
<td style="text-align:center;">4.6</td>
</tr>
<tr>
<td style="text-align:center;">Vin</td>
<td style="text-align:center;">45</td>
<td style="text-align:center;">Male</td>
<td style="text-align:center;">3.9</td>
</tr>
<tr>
<td style="text-align:center;">Katie</td>
<td style="text-align:center;">38</td>
<td style="text-align:center;">Female</td>
<td style="text-align:center;">2.78</td>
</tr>
</table>
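
<p>One way to reproduce the table above is a dict of lists, one list per column. This is only a sketch; it shows that each column ends up with its own dtype.</p>

```python
import pandas as pd

# Each key becomes a column; each list holds that column's values.
df = pd.DataFrame({
    'Name': ['Steve', 'Lia', 'Vin', 'Katie'],
    'Age': [32, 28, 45, 38],
    'Gender': ['Male', 'Female', 'Male', 'Female'],
    'Rating': [3.45, 4.6, 3.9, 2.78],
})

# Columns are typed independently: integers, floats, and strings.
print(df.dtypes)
```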

<h2>Panel</h2>
<p>Panel is a three-dimensional data structure with heterogeneous data. A panel is hard to represent graphically, but it can be illustrated as a container of DataFrames.</p>
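
<p>Because Panel is no longer available in current pandas releases, the "container of DataFrames" idea can be approximated by stacking several DataFrames under an outer index level. The sketch below is a substitute technique, not the Panel API; the keys <code>item1</code> and <code>item2</code> and the column names are hypothetical.</p>

```python
import pandas as pd

# A plain dict acts as the container; each value is a 2D DataFrame.
frames = {
    'item1': pd.DataFrame({'a': [1, 2], 'b': [3, 4]}),
    'item2': pd.DataFrame({'a': [5, 6], 'b': [7, 8]}),
}

# Concatenating a dict uses its keys as the outer level of a MultiIndex.
stacked = pd.concat(frames)

# Selecting an outer key returns the rows of that DataFrame.
print(stacked.loc['item2'])
```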



# **DataFrames**

<h2>Create DataFrame</h2>
<p>A pandas DataFrame can be created using various inputs like &minus;</p>
<ul class="list">
<li>Lists</li>
<li>dict</li>
<li>Series</li>
<li>NumPy ndarrays</li>
<li>Another DataFrame</li>
</ul>

In [0]:
# Import the pandas library under the alias pd
import pandas as pd
# Create an empty DataFrame
df = pd.DataFrame()
print(df)

In [0]:
# Create a DataFrame from a flat list: each element becomes one row in a single column
import pandas as pd
data = [1, 2, 3, 4, 5]
df = pd.DataFrame(data)
print(df)

In [0]:
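# Create a DataFrame from a list of lists: each inner list is a row, and columns are named explicitly
import pandas as pd
data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
print(df)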

In [0]:
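# Create a DataFrame from a list of lists, then cast the numeric column to float.
# Passing dtype=float to the constructor can raise an error on recent pandas versions,
# because the 'Name' column holds strings.
import pandas as pd
data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
df = df.astype({'Age': float})
print(df)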

In [0]:
# Create a DataFrame from a dict of ndarrays or lists: keys become column names, and all values must have the same length
import pandas as pd
data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}
df = pd.DataFrame(data)
print(df)
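
<p>The remaining inputs from the list above (Series, NumPy ndarrays, and another DataFrame) can be sketched in the same way. The variable names, index labels, and column names here are illustrative assumptions.</p>

```python
import numpy as np
import pandas as pd

# From a dict of Series: the indexes are aligned, and missing entries become NaN.
d = {
    'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
    'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
}
df_series = pd.DataFrame(d)
print(df_series)

# From a 2D NumPy ndarray: rows map to rows, and columns are named explicitly.
arr = np.array([[1, 2], [3, 4]])
df_array = pd.DataFrame(arr, columns=['x', 'y'])
print(df_array)

# From another DataFrame: copy=True requests an independent copy of the data.
df_copy = pd.DataFrame(df_array, copy=True)
print(df_copy)
```

<p>Note that column <code>one</code> has no value for label <code>d</code>, so pandas fills that cell with NaN and the column is stored as floats.</p>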