We can apply basic arithmetic operations such as addition, subtraction, multiplication, and division to pandas Series and DataFrame objects.
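
As a minimal sketch of what this looks like in practice, the snippet below uses a small, made-up DataFrame (the ``quiz`` and ``exam`` columns and their values are illustrative assumptions, not part of the candidate data) to show arithmetic with a scalar and arithmetic between two Series:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
scores = pd.DataFrame(
    {"quiz": [8.0, 6.0, 9.0], "exam": [70.0, 85.0, 90.0]},
    index=["a", "b", "c"],
)

# Arithmetic with a scalar is applied to every element of the Series.
scaled_quiz = scores["quiz"] * 10

# Arithmetic between two Series aligns them on their index labels,
# then operates element by element.
combined = scaled_quiz + scores["exam"]

# Arithmetic between a DataFrame and a scalar touches every column.
halved = scores / 2

print(combined)
print(halved)
```

Because pandas aligns on labels rather than positions, two Series with the same labels in a different order would still combine row by row as expected.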

We can use this technique to add a new column to a pandas DataFrame. For example, try calculating a total score as a linear combination of your candidates’ Python, Django, and JavaScript scores:

In [7]:
import os
import numpy as np
import pandas as pd

file_path = os.path.join('resources','candidate.csv')

df = pd.read_csv(file_path,index_col=0)

print(df)

# Add the JavaScript scores as a new last column, then insert the Django
# scores at position 4 so that they appear just before js-score.

df['js-score'] = np.array([71.0, 95.0, 88.0, 79.0, 91.0, 91.0, 80.0])
df.insert(loc=4, column='django-score', value=np.array([86.0, 81.0, 78.0, 88.0, 74.0, 70.0, 81.0]))
print()
print('*** Adding Column django-score and js-score')
print()
print(df)

print()
print('Add new column by calculating total score from py-score,django-score and js-score')
print()

df['total_score'] = 0.4 * df['py-score'] + 0.3 * df['django-score'] + 0.3 * df['js-score']

print(df)



       name         city  age  py-score
101  Xavier  Mexico City   41      88.0
102     Ann      Toronto   28      79.0
103    Jana       Prague   33      81.0
104      Yi     Shanghai   34      80.0
105   Robin   Manchester   38      68.0
106    Amal        Cairo   31      61.0
107    Nori        Osaka   37      84.0

*** Adding Column django-score and js-score

       name         city  age  py-score  django-score  js-score
101  Xavier  Mexico City   41      88.0          86.0      71.0
102     Ann      Toronto   28      79.0          81.0      95.0
103    Jana       Prague   33      81.0          78.0      88.0
104      Yi     Shanghai   34      80.0          88.0      79.0
105   Robin   Manchester   38      68.0          74.0      91.0
106    Amal        Cairo   31      61.0          70.0      91.0
107    Nori        Osaka   37      84.0          81.0      80.0

Add new column by calculating total score from py-score,django-score and js-score

       name         city  age  py-scor
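
The same weighted sum can also be written with the weights kept in one place. The following is only a sketch: the ``weights`` Series is a hypothetical helper that is not part of the original notebook, and the DataFrame is rebuilt from the score values shown above so that the example runs on its own:

```python
import pandas as pd

# Scores copied from the candidate table shown above.
df = pd.DataFrame(
    {
        "py-score": [88.0, 79.0, 81.0, 80.0, 68.0, 61.0, 84.0],
        "django-score": [86.0, 81.0, 78.0, 88.0, 74.0, 70.0, 81.0],
        "js-score": [71.0, 95.0, 88.0, 79.0, 91.0, 91.0, 80.0],
    },
    index=range(101, 108),
)

# Hypothetical helper: one weight per score column.
weights = pd.Series({"py-score": 0.4, "django-score": 0.3, "js-score": 0.3})

# Multiplying a DataFrame by a Series matches the Series index against the
# column labels, so each column is scaled by its own weight. Summing along
# axis=1 then adds the weighted scores within each row.
df["total_score"] = (df[weights.index] * weights).sum(axis=1)

print(df["total_score"].round(1))
```

Keeping the weights in a Series makes it easy to check that they sum to 1 (``weights.sum()``) and to change them without editing the formula.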

### Determining Data Statistics

We can get basic statistics for the numerical columns of a pandas DataFrame with ``.describe()``:

In [8]:
print('Getting general statistics for numerical column of DataFrame')

print()

print(df.describe())

Getting general statistics for numerical column of DataFrame

             age   py-score  django-score   js-score  total_score
count   7.000000   7.000000      7.000000   7.000000     7.000000
mean   34.571429  77.285714     79.714286  85.000000    80.328571
std     4.429339   9.446592      6.343350   8.544004     4.101510
min    28.000000  61.000000     70.000000  71.000000    72.700000
25%    32.000000  73.500000     76.000000  79.500000    79.300000
50%    34.000000  80.000000     81.000000  88.000000    82.100000
75%    37.500000  82.500000     83.500000  91.000000    82.250000
max    41.000000  88.000000     88.000000  95.000000    84.400000


Here, ``.describe()`` returns a new DataFrame. The ``count`` row gives the number of non-missing values in each numeric column, and the remaining rows give the mean, standard deviation, minimum, quartiles, and maximum of each column.
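
Since the result is an ordinary DataFrame, individual statistics can be read from it by label. The sketch below rebuilds only the ``py-score`` column from the values shown earlier so that it is self-contained; the variable names (``py_scores``, ``stats``, ``custom``) are illustrative:

```python
import pandas as pd

# py-score values copied from the candidate table shown earlier.
py_scores = pd.DataFrame({"py-score": [88.0, 79.0, 81.0, 80.0, 68.0, 61.0, 84.0]})

stats = py_scores.describe()

# The statistic names are the row labels of the returned DataFrame.
print(stats.index.tolist())

# Look up single values with .loc[row_label, column_label].
mean_py = stats.loc["mean", "py-score"]
median_py = stats.loc["50%", "py-score"]
print(mean_py, median_py)

# The percentiles parameter replaces the default 25% and 75% rows;
# the 50% row (the median) is still included.
custom = py_scores.describe(percentiles=[0.1, 0.9])
print(custom.index.tolist())
```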

To get a particular statistic for some or all of the columns, we can call methods such as ``.mean()`` or ``.std()``.



In [15]:
print('To get mean of py-score columns')

print()

print(df['py-score'].mean())

print()

print('To get std deviation of total-score columns')

print()

print(df['total_score'].std())

print()

print('To get std deviation of all columns')

print(df.std(numeric_only=True))


To get mean of py-score columns

77.28571428571429

To get std deviation of total-score columns

4.101509594329988

To get std deviation of all columns
age             4.429339
py-score        9.446592
django-score    6.343350
js-score        8.544004
total_score     4.101510
dtype: float64


> Note: When applied to a pandas DataFrame, these methods return a Series with the result for each column. When applied to a Series object, or to a single column of a DataFrame, they return a scalar.
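
The difference in return types can be checked directly. This is a minimal sketch that rebuilds two of the numeric columns from the table above; the use of ``.agg()`` at the end is an addition for illustration and does not appear in the original notebook:

```python
import pandas as pd

# age and py-score values copied from the candidate table shown earlier.
df = pd.DataFrame(
    {
        "age": [41, 28, 33, 34, 38, 31, 37],
        "py-score": [88.0, 79.0, 81.0, 80.0, 68.0, 61.0, 84.0],
    },
    index=range(101, 108),
)

# On a DataFrame: one result per column, returned as a Series.
col_means = df.mean(numeric_only=True)
print(type(col_means).__name__)

# On a single column (a Series): a single scalar value.
one_mean = df["py-score"].mean()
print(type(one_mean).__name__)

# Several statistics at once: .agg() with a list of names returns a
# DataFrame with one row per statistic.
summary = df.agg(["mean", "std"])
print(summary)
```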