**What is a DataFrame?**
* A Pandas DataFrame is a two-dimensional data structure, similar to a two-dimensional array or a table with rows and columns.

In [1]:
import pandas as pd

In [2]:
# Create a simple Pandas DataFrame:
data =    {
           "Protein": [38, 27, 18, 25, 29],
           "Fats": [32, 39, 40, 35, 33],
           "Carbs": [46, 47, 50, 60, 65]
          }

# Load the data into a DataFrame object:
df = pd.DataFrame(data)

print(df) 

   Protein  Fats  Carbs
0       38    32     46
1       27    39     47
2       18    40     50
3       25    35     60
4       29    33     65
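
The two dimensions can be inspected directly. The following is a minimal sketch that rebuilds the same DataFrame and reads its `shape`, `columns`, and `index` attributes; the variable names `n_rows`, `n_cols`, `column_names`, and `row_labels` are illustrative only.

```python
import pandas as pd

data = {
    "Protein": [38, 27, 18, 25, 29],
    "Fats": [32, 39, 40, 35, 33],
    "Carbs": [46, 47, 50, 60, 65],
}
df = pd.DataFrame(data)

# shape is a (rows, columns) tuple
n_rows, n_cols = df.shape

# columns holds the column labels, index holds the row labels
column_names = list(df.columns)
row_labels = list(df.index)  # default labels are 0, 1, 2, ...

print(n_rows, n_cols)
print(column_names)
print(row_labels)
```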


**Locate Row**
* As the result above shows, the DataFrame is like a table with rows and columns.
* Pandas uses the loc attribute to return one or more specified rows by their index label.

In [3]:
# Refer to the row index:
print(df.loc[0])    # Return row 0

Protein    38
Fats       32
Carbs      46
Name: 0, dtype: int64


In [4]:
# Refer to the row index:
print(df.loc[2])    # Return row 2

Protein    18
Fats       40
Carbs      50
Name: 2, dtype: int64


In [5]:
# Return rows 2 and 3:
print(df.loc[[2, 3]])

   Protein  Fats  Carbs
2       18    40     50
3       25    35     60
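
The outputs above differ in form: a single label gives a Series, while a list of labels gives a DataFrame. The sketch below, which assumes the same default-indexed DataFrame, makes that difference explicit and also shows a label slice, which in `loc` includes both endpoints. The names `one_row`, `two_rows`, and `label_slice` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Protein": [38, 27, 18, 25, 29],
    "Fats": [32, 39, 40, 35, 33],
    "Carbs": [46, 47, 50, 60, 65],
})

one_row = df.loc[2]          # single label -> Series
two_rows = df.loc[[2, 3]]    # list of labels -> DataFrame
label_slice = df.loc[1:3]    # label slice: rows 1, 2 and 3 (end included)

print(type(one_row).__name__)
print(type(two_rows).__name__)
print(list(label_slice.index))
```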


**Named Indexes**
* With the index argument, you can assign your own labels to the rows.

In [6]:
# Add a list of names to give each row a name:

data = {
        "Protein": [38, 27, 18, 25, 29],
        "Fats": [32, 39, 40, 35, 33],
        "Carbs": [46, 47, 50, 60, 65]
       }

df = pd.DataFrame(data, index = ["P1", "P2", "P3", "P4", "P5"])

print(df) 

    Protein  Fats  Carbs
P1       38    32     46
P2       27    39     47
P3       18    40     50
P4       25    35     60
P5       29    33     65


In [7]:
# Refer to the row index:
print(df.loc["P4"])    # Return the row labeled "P4"

Protein    25
Fats       35
Carbs      60
Name: P4, dtype: int64
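
Once rows have names, `loc` expects those names rather than integer positions. As a small sketch (not part of the original notebook), the example below contrasts `loc` with `iloc`, the position-based counterpart, and looks up a single cell by row and column label.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Protein": [38, 27, 18, 25, 29],
        "Fats": [32, 39, 40, 35, 33],
        "Carbs": [46, 47, 50, 60, 65],
    },
    index=["P1", "P2", "P3", "P4", "P5"],
)

by_label = df.loc["P4"]        # select by row label
by_position = df.iloc[3]       # select by integer position (fourth row)
fats_p4 = df.loc["P4", "Fats"] # single cell: row label, column label

print(by_label.equals(by_position))
print(fats_p4)
```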


**Load Files Into a DataFrame**
* If your data sets are stored in a file, Pandas can load them into a DataFrame.

In [8]:
# Load a comma separated file (CSV file) into a DataFrame:
df = pd.read_csv('/content/Iris.csv')
print(df) 

      Id  SepalLengthCm  ...  PetalWidthCm         Species
0      1            5.1  ...           0.2     Iris-setosa
1      2            4.9  ...           0.2     Iris-setosa
2      3            4.7  ...           0.2     Iris-setosa
3      4            4.6  ...           0.2     Iris-setosa
4      5            5.0  ...           0.2     Iris-setosa
..   ...            ...  ...           ...             ...
145  146            6.7  ...           2.3  Iris-virginica
146  147            6.3  ...           1.9  Iris-virginica
147  148            6.5  ...           2.0  Iris-virginica
148  149            6.2  ...           2.3  Iris-virginica
149  150            5.9  ...           1.8  Iris-virginica

[150 rows x 6 columns]
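
The path `/content/Iris.csv` only exists in the original environment. To try `read_csv` without that file, one option is to pass an in-memory text buffer, as sketched here. The three rows are copied from the output shown in this notebook; `csv_text` and `sample` are illustrative names, and using `index_col="Id"` is an optional choice, not something the original cell does.

```python
import io
import pandas as pd

# A few rows in the same layout as Iris.csv
csv_text = """Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
1,5.1,3.5,1.4,0.2,Iris-setosa
2,4.9,3.0,1.4,0.2,Iris-setosa
3,4.7,3.2,1.3,0.2,Iris-setosa
"""

# read_csv accepts a file path or any file-like object;
# index_col uses the Id column as the row labels
sample = pd.read_csv(io.StringIO(csv_text), index_col="Id")

print(sample)
print(sample.dtypes)
```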


**Read CSV Files**
* A simple way to store big data sets is to use CSV files (comma-separated values files).
* CSV files contain plain text in a well-known format that can be read by most tools, including Pandas.


In [9]:
# Load the CSV into a DataFrame:
df = pd.read_csv('/content/Iris.csv')
print(df.to_string()) # to_string() is used to print the entire DataFrame.

      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm          Species
0      1            5.1           3.5            1.4           0.2      Iris-setosa
1      2            4.9           3.0            1.4           0.2      Iris-setosa
2      3            4.7           3.2            1.3           0.2      Iris-setosa
3      4            4.6           3.1            1.5           0.2      Iris-setosa
4      5            5.0           3.6            1.4           0.2      Iris-setosa
5      6            5.4           3.9            1.7           0.4      Iris-setosa
6      7            4.6           3.4            1.4           0.3      Iris-setosa
7      8            5.0           3.4            1.5           0.2      Iris-setosa
8      9            4.4           2.9            1.4           0.2      Iris-setosa
9     10            4.9           3.1            1.5           0.1      Iris-setosa
10    11            5.4           3.7            1.5           0.2      Iris

In [10]:
# Without to_string(), print(df) shows a reduced sample (first and last rows only):
df = pd.read_csv('/content/Iris.csv')
print(df) 

      Id  SepalLengthCm  ...  PetalWidthCm         Species
0      1            5.1  ...           0.2     Iris-setosa
1      2            4.9  ...           0.2     Iris-setosa
2      3            4.7  ...           0.2     Iris-setosa
3      4            4.6  ...           0.2     Iris-setosa
4      5            5.0  ...           0.2     Iris-setosa
..   ...            ...  ...           ...             ...
145  146            6.7  ...           2.3  Iris-virginica
146  147            6.3  ...           1.9  Iris-virginica
147  148            6.5  ...           2.0  Iris-virginica
148  149            6.2  ...           2.3  Iris-virginica
149  150            5.9  ...           1.8  Iris-virginica

[150 rows x 6 columns]
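
Besides relying on the default truncation, the amount of output can be controlled explicitly. The sketch below uses a stand-in DataFrame of 100 rows (not the Iris data) to show `head()`, `tail()`, and the `display.max_rows` option applied temporarily through `pd.option_context`.

```python
import pandas as pd

# Stand-in data: 100 rows, one column (hypothetical, for illustration)
df = pd.DataFrame({"value": range(100)})

first = df.head(3)   # first 3 rows
last = df.tail(2)    # last 2 rows

print(first)
print(last)

# Temporarily limit how many rows print(df) shows before truncating
with pd.option_context("display.max_rows", 10):
    print(df)
```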


**Read JSON**
* Big data sets are often stored or exchanged as JSON.
* JSON is plain text structured like an object. It is widely used in programming and is supported by Pandas.


In [12]:
df = pd.read_json('/content/EmployeeData.json')
print(df) 

      id                 name  ... creditBalance myCash
0   4051                manoj  ...           127      0
1   4050               pankaj  ...             0      0
2   3050           Neeraj1993  ...             0      0
3   3049               Sophia  ...            36      0
4   3048          Raju Prasad  ...             5      0
5   3047    Ankiish Thapliyal  ...           167      0
6   3046      Aryan Thapliyal  ...           160      0
7   3045               shivam  ...            11      0
8   3044       Navya Upadhyay  ...             5      0
9   3043           Ritu singh  ...             5      0
10  3042                 faiz  ...            38      0
11  3041        Martin Wilson  ...            18      0
12  3040         Shweta Singh  ...            77      0
13  3039               jagjit  ...            32      0
14  3038             Ashi 123  ...             0      0
15  3037        shivamvermaa4  ...            10      0
16  3036               Sophia  ...            45

Sample JSON file source: https://www.appsloveworld.com/download-sample-json-file-with-multiple-records/

In [14]:
df = pd.read_json('/content/EmployeeData.json')
print(df.to_string()) 

      id                 name                               email               password                                                                                                                                                                                                                                                            about                                 token  country                                                                                                                                   location     lng       lat                    dob  gender  userType  userStatus                                    profilePicture                                     coverPicture  enablefollowme  sendmenotifications  sendTextmessages  enabletagging                    createdAt                    updatedAt    livelng    livelat                                                                                        liveLocation  creditBalance  myCash
0   4051                m
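
To experiment with `read_json` without the `EmployeeData.json` file, a JSON string can be wrapped in a text buffer. This is a minimal sketch: the column names `id`, `name`, and `creditBalance` match columns seen in the output above, but the two records are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical records: a JSON array of objects becomes one row per object
json_text = """[
  {"id": 1, "name": "Ada", "creditBalance": 10},
  {"id": 2, "name": "Linus", "creditBalance": 0}
]"""

records = pd.read_json(io.StringIO(json_text))

print(records)
```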

**Dictionary as JSON**
* If your JSON data is not in a file but in a Python dictionary, you can load it into a DataFrame directly:

In [15]:
# Load a Python Dictionary into a DataFrame:
data = {
  "Duration":{
    "0":60,
    "1":60,
    "2":60,
    "3":45,
    "4":45,
    "5":60
  },
  "Pulse":{
    "0":110,
    "1":117,
    "2":103,
    "3":109,
    "4":117,
    "5":102
  },
  "Maxpulse":{
    "0":130,
    "1":145,
    "2":135,
    "3":175,
    "4":148,
    "5":127
  },
  "Calories":{
    "0":409,
    "1":479,
    "2":340,
    "3":282,
    "4":406,
    "5":300
  }
}

df = pd.DataFrame(data)

print(df) 

   Duration  Pulse  Maxpulse  Calories
0        60    110       130       409
1        60    117       145       479
2        60    103       135       340
3        45    109       175       282
4        45    117       148       406
5        60    102       127       300
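
One detail is easy to miss in the output above: the inner dictionary keys are strings, so the row labels are `"0"`, `"1"`, and so on, not integers. The sketch below uses a shortened version of the same dictionary to show this, and converts the labels to integers as one possible follow-up step; `original_labels` is an illustrative name.

```python
import pandas as pd

# Shortened version of the dictionary above (first three entries, two columns)
data = {
    "Duration": {"0": 60, "1": 60, "2": 60},
    "Pulse": {"0": 110, "1": 117, "2": 103},
}

df = pd.DataFrame(data)

# Outer keys become columns, inner keys become row labels (strings here)
original_labels = list(df.index)
print(original_labels)

# Optional: convert the string labels to integers
df.index = df.index.astype(int)
print(df.loc[1])
```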
