![pandas logo](https://camo.githubusercontent.com/981d48e57e23a4907cebc4eb481799b5882595ea978261f22a3e131dcd6ebee6/68747470733a2f2f70616e6461732e7079646174612e6f72672f7374617469632f696d672f70616e6461732e737667)

# pandas: powerful Python data analysis toolkit

## What is it?

**pandas** is a Python package that provides fast, flexible, and expressive data
structures designed to make working with "relational" or "labeled" data both
easy and intuitive. It aims to be the fundamental high-level building block for
doing practical, **real world** data analysis in Python. Additionally, it has
the broader goal of becoming **the most powerful and flexible open source data
analysis / manipulation tool available in any language**. It is already well on
its way towards this goal.

## Main Features
Here are just a few of the things that pandas does well:

  - Easy handling of [**missing data**][missing-data] (represented as
    `NaN`, `NA`, or `NaT`) in floating point as well as non-floating point data
  - Size mutability: columns can be [**inserted and
    deleted**][insertion-deletion] from DataFrame and higher dimensional
    objects
  - Automatic and explicit [**data alignment**][alignment]: objects can
    be explicitly aligned to a set of labels, or the user can simply
    ignore the labels and let `Series`, `DataFrame`, etc. automatically
    align the data for you in computations
  - Powerful, flexible [**group by**][groupby] functionality to perform
    split-apply-combine operations on data sets, for both aggregating
    and transforming data
  - Make it [**easy to convert**][conversion] ragged,
    differently-indexed data in other Python and NumPy data structures
    into DataFrame objects
  - Intelligent label-based [**slicing**][slicing], [**fancy
    indexing**][fancy-indexing], and [**subsetting**][subsetting] of
    large data sets
  - Intuitive [**merging**][merging] and [**joining**][joining] data
    sets
  - Flexible [**reshaping**][reshape] and [**pivoting**][pivot-table] of
    data sets
  - [**Hierarchical**][mi] labeling of axes (possible to have multiple
    labels per tick)
  - Robust IO tools for loading data from [**flat files**][flat-files]
    (CSV and delimited), [**Excel files**][excel], [**databases**][db],
    and saving/loading data from the ultrafast [**HDF5 format**][hdfstore]
  - [**Time series**][timeseries]-specific functionality: date range
    generation and frequency conversion, moving window statistics,
    date shifting and lagging


   [missing-data]: https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html
   [insertion-deletion]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#column-selection-addition-deletion
   [alignment]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html?highlight=alignment#intro-to-data-structures
   [groupby]: https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#group-by-split-apply-combine
   [conversion]: https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#dataframe
   [slicing]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#slicing-ranges
   [fancy-indexing]: https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html#advanced
   [subsetting]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-indexing
   [merging]: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html#database-style-dataframe-or-named-series-joining-merging
   [joining]: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html#joining-on-index
   [reshape]: https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html
   [pivot-table]: https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html
   [mi]: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#hierarchical-indexing-multiindex
   [flat-files]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#csv-text-files
   [excel]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#excel-files
   [db]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#sql-queries
   [hdfstore]: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#hdf5-pytables
   [timeseries]: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#time-series-date-functionality
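
Two of the features listed above, automatic data alignment and group by, can be sketched in a few lines. The Series labels and the `key`/`value` column names are made up for illustration and do not come from any real data set.

```python
import pandas as pd

# Two Series whose labels only partially overlap (illustrative data).
a = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "z"])
b = pd.Series([10.0, 20.0], index=["y", "z"])

# Arithmetic aligns on labels. "x" has no partner in b, so the result
# holds a missing value (NaN) at that label.
total = a + b
print(total)

# Split-apply-combine: group rows by a key and aggregate each group.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})
sums = df.groupby("key")["value"].sum()
print(sums)
```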

## Where to get it
The source code is currently hosted on GitHub at:
https://github.com/pandas-dev/pandas

Binary installers for the latest released version are available at the [Python
Package Index (PyPI)](https://pypi.org/project/pandas) and on [Conda](https://docs.conda.io/en/latest/).

```sh
# conda
conda install pandas
```

```sh
# or PyPI
pip install pandas
```
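
After either install command, a quick way to confirm that the package is importable is to print its version from Python. This is only a sanity check; the version string you see depends on your environment.

```python
import pandas as pd

# Print the installed pandas version to confirm the import works.
print(pd.__version__)
```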

## Dependencies
- [NumPy](https://www.numpy.org): adds support for large, multi-dimensional arrays and matrices, along with high-level mathematical functions that operate on them
- [python-dateutil](https://dateutil.readthedocs.io/en/stable/index.html): provides powerful extensions to the standard datetime module
- [pytz](https://github.com/stub42/pytz): brings the Olson tz database into Python, which allows accurate and cross-platform timezone calculations

See the [full installation instructions](https://pandas.pydata.org/pandas-docs/stable/install.html#dependencies) for minimum supported versions of required, recommended and optional dependencies.



# Importing the library

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv("../input/sharktankindia/ShartankIndiaAllPitches.csv")

In [3]:
df.head()

|   | Episode Number | Pitch Number | Brand | Idea | Investment Amount (In Lakhs INR) | Debt (In lakhs INR) | Equity | Anupam | Ashneer | Namita | Aman | Peyush | Vineeta | Ghazal | Season |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | BluePine Industries | Frozen Momos | 75 | 0 | 18% | N | Y | N | Y | N | Y | N | 1 |
| 1 | 1 | 2 | Booz scooters | Renting e-bike for mobility in private spaces | 40 | 0 | 50% | N | Y | N | N | N | Y | N | 1 |
| 2 | 1 | 3 | Heart up my Sleeves | Detachable Sleeves | 25 | 0 | 30% | Y | N | N | N | N | Y | N | 1 |
| 3 | 2 | 4 | Tagz Foods | Healthy Potato Chips | 70 | 0 | 2.75% | N | Y | N | N | N | N | N | 1 |
| 4 | 2 | 5 | Head and Heart | Brain Development Course | 0 | 0 | 0 | N | N | N | N | N | N | N | 1 |
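
The investor columns in the preview hold `Y`/`N` flags, so counting deals reduces to comparing against `"Y"` and summing. The sketch below rebuilds only the five rows shown above. It assumes that `Y` means the investor took part in the deal, which the data itself does not state.

```python
import pandas as pd

sharks = ["Anupam", "Ashneer", "Namita", "Aman", "Peyush", "Vineeta", "Ghazal"]

# The five rows from the preview, reduced to the brand and investor columns.
sample = pd.DataFrame(
    [
        ["BluePine Industries", "N", "Y", "N", "Y", "N", "Y", "N"],
        ["Booz scooters", "N", "Y", "N", "N", "N", "Y", "N"],
        ["Heart up my Sleeves", "Y", "N", "N", "N", "N", "Y", "N"],
        ["Tagz Foods", "N", "Y", "N", "N", "N", "N", "N"],
        ["Head and Heart", "N", "N", "N", "N", "N", "N", "N"],
    ],
    columns=["Brand"] + sharks,
)

# Comparing with "Y" gives a boolean frame; summing counts True as 1.
invested = sample[sharks] == "Y"
deals_per_shark = invested.sum()            # one count per investor column
sharks_per_pitch = invested.sum(axis=1)     # one count per row (pitch)

print(deals_per_shark)
print(sharks_per_pitch)
```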


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 117 entries, 0 to 116
Data columns (total 15 columns):
 #   Column                             Non-Null Count  Dtype 
---  ------                             --------------  ----- 
 0   Episode Number                     117 non-null    int64 
 1   Pitch Number                       117 non-null    int64 
 2   Brand                              117 non-null    object
 3   Idea                               117 non-null    object
 4   Investment Amount (In Lakhs INR)   117 non-null    int64 
 5   Debt (In lakhs INR)                117 non-null    int64 
 6   Equity                             117 non-null    object
 7   Anupam                             117 non-null    object
 8   Ashneer                            117 non-null    object
 9   Namita                             117 non-null    object
 10  Aman                               117 non-null    object
 11  Peyush                             117 non-null    object
 12  Vineeta 
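
The `Equity` column is reported as `object` because its values are strings such as `18%`. One possible way to make it numeric is to strip the percent sign and convert. This is a sketch that uses only the five values visible in the preview, not the full file.

```python
import pandas as pd

# Equity values as they appear in the first five rows of the data set.
equity = pd.Series(["18%", "50%", "30%", "2.75%", "0"], name="Equity")

# Drop a trailing "%" where present, then parse the remainder as numbers.
# errors="coerce" turns anything unparseable into NaN instead of raising.
equity_pct = pd.to_numeric(equity.str.rstrip("%"), errors="coerce")

print(equity_pct)
print(equity_pct.dtype)
```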

# Loading the list of countries by tax revenue to GDP ratio
Source: [Wikipedia](https://en.wikipedia.org/wiki/List_of_countries_by_tax_revenue_to_GDP_ratio)

In [5]:
df2 = pd.read_html("https://en.wikipedia.org/wiki/List_of_countries_by_tax_revenue_to_GDP_ratio")

In [6]:
df2

[    0                                                  1
 0 NaN  This article needs to be updated. The reason g...,
                                                     0
 0                                 Part of a series on
 1                                            Taxation
 2                          An aspect of fiscal policy
 3   Policies Government revenue Property tax equal...
 4   Economics General Theory Price effect Excess b...
 5                                      General Theory
 6   Price effect Excess burden Tax incidence Laffe...
 7                                 Distribution of Tax
 8   Tax rate Flat Progressive Regressive Proportional
 9   Collection Revenue service Revenue stamp Tax a...
 10  Noncompliance General Tax avoidance Tax evasio...
 11                                            General
 12  Tax avoidance Tax evasion Tax resistance Tax s...
 13                                          Corporate
 14  Tax inversion Transfer mispricing Base erosion...
 15

In [7]:
type(df2)

list
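
Because the result is a plain Python list of DataFrames, it can be surveyed in a loop instead of inspecting each position by hand. The sketch below uses a small hand-built list as a stand-in for the real result, so it runs without network access; the name `tables` and the sample rows are illustrative.

```python
import pandas as pd

# Hand-built stand-in for the list that pd.read_html returns (illustrative only).
tables = [
    pd.DataFrame({0: ["Part of a series on", "Taxation"]}),
    pd.DataFrame(
        {
            "Region": ["Europe"],
            "Country Name": ["France"],
            "Tax Revenue (% of GDP)": [46.2],
        }
    ),
]

# Print the position, shape, and first few column labels of every table.
for i, table in enumerate(tables):
    print(i, table.shape, list(table.columns)[:3])

# Select the first table that contains the column of interest,
# rather than relying on a fixed position in the list.
gdp_table = next(t for t in tables if "Country Name" in t.columns)
print(gdp_table)
```

Selecting by column name is less brittle than a fixed index if the page layout changes. `pd.read_html` also accepts a `match` argument that limits the result to tables containing the given text, which may serve the same purpose.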

In [8]:
df2[0]

|   | 0 | 1 |
|---|---|---|
| 0 |  | This article needs to be updated. The reason g... |


In [9]:
df2[1]

|   | 0 |
|---|---|
| 0 | Part of a series on |
| 1 | Taxation |
| 2 | An aspect of fiscal policy |
| 3 | Policies Government revenue Property tax equal... |
| 4 | Economics General Theory Price effect Excess b... |
| 5 | General Theory |
| 6 | Price effect Excess burden Tax incidence Laffe... |
| 7 | Distribution of Tax |
| 8 | Tax rate Flat Progressive Regressive Proportional |
| 9 | Collection Revenue service Revenue stamp Tax a... |


In [10]:
df2[2]

|   | General Theory |
|---|---|
| 0 | Price effect Excess burden Tax incidence Laffe... |
| 1 | Distribution of Tax |
| 2 | Tax rate Flat Progressive Regressive Proportional |


In [11]:
df2[3]

|   | General |
|---|---|
| 0 | Tax avoidance Tax evasion Tax resistance Tax s... |
| 1 | Corporate |
| 2 | Tax inversion Transfer mispricing Base erosion... |
| 3 | Locations |
| 4 | Tax havens Corporate havens Offshore financial... |
| 5 | Major examples |
| 6 | Ireland as a tax haven Ireland v Commission Le... |


In [12]:
df2[4]

|   | Academic |
|---|---|
| 0 | Mihir A. Desai Dhammika Dharmapala James R. Hi... |
| 1 | Advocacy groups |
| 2 | Tax Justice Network (TJN) Institute on Taxatio... |


In [13]:
df2[5]

|   | All Countries |
|---|---|
| 0 | List of countries by tax rates Tax revenue to ... |
| 1 | Individual Countries |
| 2 | Albania Algeria Argentina Armenia Australia Az... |


In [14]:
df2[6]

|   | Region | Country Name | Tax Revenue (% of GDP) | GDP (Billions, PPP) | Tax Revenue (Billions) | Gov't Expenditure (% of GDP(nominal)) | Public Debt (% of GDP(nominal)) |
|---|---|---|---|---|---|---|---|
| 0 | Europe | France | 46.2 | $2,962.8 | $1,368.81 | 56.4 | 98.6 |
| 1 | Europe | Denmark | 46.0 | $301.3 | $138.60 | 51.7 | 34.3 |
| 2 | Europe | Belgium | 44.6 | $550.5 | $245.52 | 52.5 | 101.4 |
| 3 | Europe | Sweden | 44.0 | $542.0 | $238.48 | 49.7 | 39.0 |
| 4 | Europe | Finland | 43.3 | $256.5 | $111.06 | 54.4 | 60.5 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 181 | Asia-Pacific | North Korea |  | $40.00 |  |  |  |
| 182 | Middle East and North Africa | Libya |  | $74.7 |  | 105.8 | 4.9 |
| 183 | Europe | Liechtenstein |  |  |  |  |  |
| 184 | Sub-Saharan Africa | Somalia |  | $21.2 |  |  |  |


In [15]:
gdp = df2[6]

In [16]:
gdp

|   | Region | Country Name | Tax Revenue (% of GDP) | GDP (Billions, PPP) | Tax Revenue (Billions) | Gov't Expenditure (% of GDP(nominal)) | Public Debt (% of GDP(nominal)) |
|---|---|---|---|---|---|---|---|
| 0 | Europe | France | 46.2 | $2,962.8 | $1,368.81 | 56.4 | 98.6 |
| 1 | Europe | Denmark | 46.0 | $301.3 | $138.60 | 51.7 | 34.3 |
| 2 | Europe | Belgium | 44.6 | $550.5 | $245.52 | 52.5 | 101.4 |
| 3 | Europe | Sweden | 44.0 | $542.0 | $238.48 | 49.7 | 39.0 |
| 4 | Europe | Finland | 43.3 | $256.5 | $111.06 | 54.4 | 60.5 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 181 | Asia-Pacific | North Korea |  | $40.00 |  |  |  |
| 182 | Middle East and North Africa | Libya |  | $74.7 |  | 105.8 | 4.9 |
| 183 | Europe | Liechtenstein |  |  |  |  |  |
| 184 | Sub-Saharan Africa | Somalia |  | $21.2 |  |  |  |


In [17]:
gdp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 186 entries, 0 to 185
Data columns (total 7 columns):
 #   Column                                 Non-Null Count  Dtype  
---  ------                                 --------------  -----  
 0   Region                                 186 non-null    object 
 1   Country Name                           186 non-null    object 
 2   Tax Revenue (% of GDP)                 181 non-null    float64
 3   GDP (Billions, PPP)                    185 non-null    object 
 4   Tax Revenue (Billions)                 181 non-null    object 
 5   Gov't Expenditure (% of GDP(nominal))  182 non-null    float64
 6   Public Debt (% of GDP(nominal))        183 non-null    float64
dtypes: float64(3), object(4)
memory usage: 10.3+ KB
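
The two currency columns come back as `object` because their values are strings with a `$` prefix and thousands separators. A minimal sketch of one way to convert such a column to floats follows. It uses three GDP values from the table above plus one missing entry, and the variable names are illustrative.

```python
import pandas as pd

# A few values in the same format as the "GDP (Billions, PPP)" column,
# including one missing entry.
gdp_ppp = pd.Series(["$2,962.8", "$301.3", "$550.5", None], name="GDP (Billions, PPP)")

# Remove the literal "$" and "," characters, then parse as numbers.
# regex=False treats "$" as a plain character rather than a regex anchor.
cleaned = pd.to_numeric(
    gdp_ppp.str.replace("$", "", regex=False).str.replace(",", "", regex=False),
    errors="coerce",
)

print(cleaned)
print(cleaned.dtype)
```

Once converted, a column like this would also be picked up by numeric summaries such as `describe()`.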


In [18]:
gdp.isnull().sum()

Region                                   0
Country Name                             0
Tax Revenue (% of GDP)                   5
GDP (Billions, PPP)                      1
Tax Revenue (Billions)                   5
Gov't Expenditure (% of GDP(nominal))    4
Public Debt (% of GDP(nominal))          3
dtype: int64
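
Given these counts, a common next step is to decide how to treat rows with missing values. The sketch below uses three rows taken from the table above. Dropping rows that lack a tax-revenue figure is just one possible choice, not necessarily the right one for every analysis.

```python
import numpy as np
import pandas as pd

# Three rows from the table, with missing entries written as NaN.
sample = pd.DataFrame(
    {
        "Country Name": ["France", "Libya", "Liechtenstein"],
        "Tax Revenue (% of GDP)": [46.2, np.nan, np.nan],
        "Public Debt (% of GDP(nominal))": [98.6, 4.9, np.nan],
    }
)

# Keep only rows where the tax-revenue column is present.
with_tax = sample.dropna(subset=["Tax Revenue (% of GDP)"])

# Fraction of missing values per column (the mean of a boolean mask).
missing_share = sample.isnull().mean()

print(with_tax)
print(missing_share)
```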

In [19]:
gdp.describe()

|   | Tax Revenue (% of GDP) | Gov't Expenditure (% of GDP(nominal)) | Public Debt (% of GDP(nominal)) |
|---|---|---|---|
| count | 181.0 | 182.0 | 183.0 |
| mean | 21.741326 | 32.404121 | 56.977049 |
| std | 10.261353 | 13.890722 | 34.891268 |
| min | 1.4 | 10.8 | 0.0 |
| 25% | 13.8 | 22.65 | 35.95 |
| 50% | 20.5 | 31.125 | 50.5 |
| 75% | 29.1 | 38.775 | 70.65 |
| max | 46.2 | 125.7 | 237.1 |
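
By default `describe()` summarizes only the numeric columns, which is why the text columns are absent from the output above. The sketch below shows two ways to bring them in, on four rows taken from the table: `include="all"` adds count, unique, top, and freq for text columns, and a group by on `Region` gives a per-region mean.

```python
import pandas as pd

# Four rows from the table: three European countries and Libya.
sample = pd.DataFrame(
    {
        "Region": ["Europe", "Europe", "Europe", "Middle East and North Africa"],
        "Country Name": ["France", "Denmark", "Belgium", "Libya"],
        "Gov't Expenditure (% of GDP(nominal))": [56.4, 51.7, 52.5, 105.8],
    }
)

# include="all" summarizes text columns alongside numeric ones;
# statistics that do not apply to a column are left as NaN.
summary = sample.describe(include="all")
print(summary)

# Mean government expenditure per region.
region_means = sample.groupby("Region")["Gov't Expenditure (% of GDP(nominal))"].mean()
print(region_means)
```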


# Exporting as a CSV file

In [20]:
gdp.to_csv("gdp_data.csv")
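
With default arguments, `to_csv` also writes the row index as an unnamed first column, which shows up as `Unnamed: 0` when the file is read back. The sketch below illustrates the difference that `index=False` makes. It writes to in-memory buffers instead of files, and the two sample rows are taken from the table above.

```python
import io

import pandas as pd

sample = pd.DataFrame(
    {"Country Name": ["France", "Denmark"], "Tax Revenue (% of GDP)": [46.2, 46.0]}
)

# Default behaviour: the index is written as an extra, unnamed column.
with_index = io.StringIO()
sample.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# index=False leaves the index out, so the round trip keeps the same columns.
without_index = io.StringIO()
sample.to_csv(without_index, index=False)
without_index.seek(0)
reloaded = pd.read_csv(without_index)

print(list(reloaded_with_index.columns))
print(list(reloaded.columns))
```

Whether to keep the index depends on whether it carries information. For a default `RangeIndex` like the one here, dropping it is usually harmless.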