# Writing efficient Python code 

In this project, we explore how to write Python programs that are readable and at the same time efficient, with fast runtime and minimal memory usage.

A part of the main Hipparcos catalog was extracted into the Hip_Sp.csv file.
Hip_Sp.csv contains four columns:
<ul>
      <li> Hip_No -- unique Hipparcos number </li>
      <li> Vmag -- visual magnitude as a measure of stellar apparent brightness </li> 
      <li> Mv -- absolute stellar magnitude, a measure of the real stellar brightness, calculated from the Hipparcos apparent visual magnitude (Vmag) and the Hipparcos measured parallax (Plx). </li>
      <li> Spectral_type -- is a measure of stellar temperature or color. </li>
</ul>

### Importing data 

In [1]:
%%time
import numpy as np
import pandas as pd

file = '../data/Hip_Sp.csv'
new_column_names = ['Hip_No', 'Vmag', 'Mv', 'Spectral_type']
hip_sp = pd.read_csv(file, header=0, sep=',',
                     usecols=[1, 2, 3, 4],
                     names=new_column_names)
hip_sp.head(5)

Wall time: 491 ms


   Hip_No  Vmag        Mv Spectral_type
0       1  9.10  1.845016  F5
1       2  9.27  5.972221  K3V
2       3  6.61 -1.146468  B9
3       4  8.06  2.506509  F0V
4       5  8.55  0.839409  G8III


### Pythonic vs. non-Pythonic code

How many stars in our Hip_Sp.csv file are more luminous than the Sun, given that the absolute magnitude of the Sun is 4.83? Because a smaller magnitude means a brighter star, every star in the catalog with an absolute magnitude, Mv, less than 4.83 is more luminous than the Sun. To answer the question, we count the values in the Mv column of the hip_sp data frame that are below 4.83.

In [2]:
%%time

#Non-Pythonic Way

star_list = []
for i in range(0, len(hip_sp['Mv'])):
    mag = hip_sp['Mv'][i]
    if mag < 4.83:
        star_list.append(mag)

print(len(star_list))

104597
Wall time: 543 ms


In [3]:
%%time

#Pythonic Way

star_list = [mag for mag in hip_sp['Mv'] if mag < 4.83]

print(len(star_list))

104597
Wall time: 20 ms
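
For a pandas column, the comparison can also be applied to the whole Series at once, which avoids the Python-level loop altogether. The following is a minimal sketch on a small hand-made frame: `sample` is an assumed stand-in for `hip_sp`, holding only the five Mv values shown in the head of the table above, and `SUN_MV` is a name introduced here for illustration. It is not a timing measured on the catalog.

```python
import pandas as pd

# Small stand-in for hip_sp: the five Mv values shown in the head of the table.
sample = pd.DataFrame({'Mv': [1.845016, 5.972221, -1.146468, 2.506509, 0.839409]})

SUN_MV = 4.83  # absolute magnitude of the Sun, as stated in the text

# List comprehension: iterates over the values in Python.
count_loop = len([mag for mag in sample['Mv'] if mag < SUN_MV])

# Vectorized comparison: builds a boolean Series; sum() counts the True values.
count_vec = int((sample['Mv'] < SUN_MV).sum())

print(count_loop, count_vec)
```

Both counts agree; on a full-size column the vectorized form would be expected to be faster, but that should be checked with %timeit on the actual data.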


### Examining runtime 

To select the most efficient code, we examine the runtime using IPython magic commands. The %timeit magic executes a statement many times and reports the mean and standard deviation of the time per loop. The number of runs is set with the -r option and the number of loops per run with the -n option. The %%time magic, by contrast, measures the wall time of a single execution of a cell, so it is more easily affected by other processes running on the computer.

In [4]:
import timeit

%timeit star_list = [mag for mag in hip_sp['Mv'] if mag < 4.83]

%timeit -r2 -n10 star_list = [mag for mag in hip_sp['Mv'] if mag < 4.83]

13.7 ms ± 294 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
14.6 ms ± 149 µs per loop (mean ± std. dev. of 2 runs, 10 loops each)
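
Outside a notebook, roughly the same measurement can be made with the standard-library `timeit` module. The sketch below is only an illustration under assumptions: `values` is a made-up list of integers rather than the catalog column, and `count_below` is a hypothetical helper. The `repeat` and `number` arguments play the roles of the -r and -n options.

```python
import timeit

# Hypothetical stand-in data: 10,000 integers instead of the Mv column.
values = list(range(10_000))

def count_below(threshold=4830):
    """Count the values smaller than the threshold with a list comprehension."""
    return len([v for v in values if v < threshold])

# repeat=2 corresponds to -r2 and number=10 to -n10.
totals = timeit.repeat(count_below, repeat=2, number=10)

# Each entry is the total time for 10 calls, so divide to get the time per call.
per_loop = [total / 10 for total in totals]
print(len(per_loop), 'runs, best time per loop:', min(per_loop))
```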


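For example, we can compare the time it takes to create a list with the square-bracket syntax [] and with Python's built-in function list(). Note that [hip_sp['Mv']] only wraps the whole column in a one-element list without copying any values, whereas list(hip_sp['Mv']) iterates over every value, so the two statements below do not do the same amount of work.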

In [5]:
%timeit -r2 -n10 Mv_list1 = [hip_sp['Mv']]

%timeit -r2 -n10 Mv_list2 = list(hip_sp['Mv'])

2.46 µs ± 905 ns per loop (mean ± std. dev. of 2 runs, 10 loops each)
10.9 ms ± 516 µs per loop (mean ± std. dev. of 2 runs, 10 loops each)
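
The large gap between the two timings is easier to interpret once the resulting objects are inspected. The sketch below uses a three-value Series, `mv`, as an assumed stand-in for `hip_sp['Mv']` and compares four ways of building a list from it.

```python
import pandas as pd

# Stand-in for hip_sp['Mv'] (first three values from the head of the table).
mv = pd.Series([1.845016, 5.972221, -1.146468], name='Mv')

wrapped = [mv]          # one-element list whose only item is the Series itself
copied = list(mv)       # iterates over the Series and collects each value
unpacked = [*mv]        # same result as list(mv), written with unpacking
as_list = mv.tolist()   # pandas method returning plain Python floats

print(len(wrapped), len(copied), len(unpacked), len(as_list))
```

Only the last three produce a list of magnitudes, so they are the ones that are meaningful to time against each other.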


### List of Hipparcos numbers for different stars

Let's create a list of Hipparcos IDs and an indexed list of absolute magnitudes using Python's built-in functions range() and enumerate().

In [6]:
%%time
hip_id_list = list(hip_sp['Hip_No'])

# range() excludes its stop value, so this list runs from 1 to hip_id_list[-1] - 1
hip_id_list1 = [*range(1, hip_id_list[-1])]
print(len(hip_id_list1))

118321
Wall time: 12.3 ms


In [7]:
%%time
mag_list = list(hip_sp['Mv'])

indexed_list = [* enumerate(mag_list, 1)]
print(indexed_list[0])

(1, 1.8450163101289387)
Wall time: 24 ms
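
Two details of these built-ins are easy to overlook: range() stops one short of its second argument, and the second argument of enumerate() sets the first index. The sketch below shows both on tiny inputs. `last_id` and `mags` are made-up illustrative values, not catalog data.

```python
last_id = 5  # hypothetical last identifier, standing in for hip_id_list[-1]

ids_exclusive = [*range(1, last_id)]      # stops at last_id - 1
ids_inclusive = [*range(1, last_id + 1)]  # includes last_id

mags = [1.85, 5.97, -1.15]                # illustrative magnitudes
indexed = [*enumerate(mags, 1)]           # numbering starts at 1 instead of 0

print(ids_exclusive)
print(ids_inclusive)
print(indexed)
```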


### Rounding values using dataframes

In [8]:
%%time

hip_sp2 = hip_sp.round({'Mv': 2})
print(hip_sp2.head(5))

   Hip_No  Vmag    Mv Spectral_type
0       1  9.10  1.85  F5          
1       2  9.27  5.97  K3V         
2       3  6.61 -1.15  B9          
3       4  8.06  2.51  F0V         
4       5  8.55  0.84  G8III       
Wall time: 14 ms


In [9]:
%%time

Mv_list = round(hip_sp['Mv'], 2)
print(Mv_list[0:5])

0    1.85
1    5.97
2   -1.15
3    2.51
4    0.84
Name: Mv, dtype: float64
Wall time: 12 ms
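
The two cells above round in slightly different ways: DataFrame.round with a dictionary touches only the named columns, while rounding a single column returns a Series. A minimal sketch on a two-row frame follows; `sample` is an assumed stand-in for `hip_sp`, built from the first two rows of the table.

```python
import pandas as pd

# Stand-in for hip_sp with the first two rows of the table.
sample = pd.DataFrame({
    'Vmag': [9.1, 9.27],
    'Mv': [1.845016, 5.972221],
})

rounded_frame = sample.round({'Mv': 2})  # only the 'Mv' column is rounded
rounded_series = sample['Mv'].round(2)   # Series method applied to one column

print(rounded_frame)
print(rounded_series.tolist())
```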


### Using NumPy array 

NumPy arrays apply a calculation to a whole set of numbers at once (vectorization), which is typically much more efficient than looping over the values in Python.

In [10]:
%%time
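# list of right ascension values from 1 to 359
# Note: np.cos and np.sin expect radians, so the values are treated as radians here;
# convert with np.deg2rad(alpha_np) first if they are meant as degrees.

alpha_list = [*range(1, 360, 1)]
alpha_np = np.array(alpha_list)
alpha_np_c = np.cos(alpha_np) * np.sin(alpha_np)
print(alpha_np_c[0:10])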

[ 0.45464871 -0.37840125 -0.13970775  0.49467912 -0.27201056 -0.26828646
  0.49530368 -0.14395166 -0.37549362  0.45647263]
Wall time: 0 ns
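
NumPy's trigonometric functions interpret their input as radians, so angles given in degrees need converting first. The sketch below is one possible variant of the cell above that performs the conversion with np.deg2rad and checks the vectorized result against a plain Python loop. The names `alpha_deg`, `alpha_rad`, `product_np` and `product_loop` are introduced here for illustration.

```python
import math
import numpy as np

alpha_deg = np.arange(1, 360)       # 1, 2, ..., 359 degrees
alpha_rad = np.deg2rad(alpha_deg)   # convert degrees to radians

# Vectorized: one expression evaluated over the whole array.
product_np = np.cos(alpha_rad) * np.sin(alpha_rad)

# Equivalent pure-Python loop, shown only for comparison.
product_loop = [math.cos(math.radians(a)) * math.sin(math.radians(a))
                for a in range(1, 360)]

print(np.allclose(product_np, product_loop))
```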


### Combining objects

We will combine the list of stellar absolute magnitudes with the list of stellar spectral types and look for the most efficient way of combining the two objects. In the timings below, the built-in zip() function is considerably faster than an index-based for loop (36 ms versus 404 ms).

In [11]:
%%time

Mv_list = hip_sp['Mv']
Sp_list = hip_sp['Spectral_type']

star_infos = []
for i, magnitude in enumerate(Mv_list):
    star_infos.append((magnitude, Sp_list[i]))

print(type(star_infos)) 
print(star_infos[0:3])

<class 'list'>
[(1.8450163101289387, 'F5          '), (5.972220574200591, 'K3V         '), (-1.1464684004746015, 'B9          ')]
Wall time: 404 ms


In [12]:
%%time

Mv_list = hip_sp['Mv']
Sp_list = hip_sp['Spectral_type']

star_infos_zip = zip(Mv_list, Sp_list)
star_infos_zip_list = [* star_infos_zip]

print(type(star_infos_zip_list))
print(star_infos_zip_list[0:3])

<class 'list'>
[(1.8450163101289387, 'F5          '), (5.972220574200591, 'K3V         '), (-1.1464684004746015, 'B9          ')]
Wall time: 36 ms
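
The two approaches produce identical lists, which the short sketch below confirms on three rows copied from the output above. It also shows one optional extra step, stripping the trailing padding from the spectral types with str.strip(). That step is an addition for illustration and not part of the original cells.

```python
# Three rows copied from the output above; the spectral types keep their padding.
mags = [1.8450163101289387, 5.972220574200591, -1.1464684004746015]
types = ['F5          ', 'K3V         ', 'B9          ']

# Index-based loop, as in cell 11.
pairs_loop = []
for i, mag in enumerate(mags):
    pairs_loop.append((mag, types[i]))

# zip pairs the items directly, as in cell 12.
pairs_zip = list(zip(mags, types))

# Optional: remove the trailing spaces while pairing.
pairs_clean = [(mag, sp.strip()) for mag, sp in zip(mags, types)]

print(pairs_loop == pairs_zip)
print(pairs_clean[0])
```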
