# Efficient code in Python
Efficient code runs fast and does not consume more RAM than it needs.
In other words, it has minimal runtime and a small memory footprint.


## Measuring execution time (runtime)

### Measure execution time with %timeit

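`%timeit <single-line code expression here>`

`%%timeit
    <code line 1>
    <code line 2>
    <code line 3>`

Both are IPython/Jupyter magic commands: `%timeit` times a single line, `%%timeit` times a whole cell. Code placed on the same line as `%%timeit` is run as setup and is not timed, so the code to measure starts on the next line.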

The `-r` and `-n` options set the number of runs and the number of loops within each run:
`%timeit -r5 -n25 set(range(51))  # -r: number of runs; -n: number of loops per run`
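
Outside IPython, a similar measurement can be sketched with the standard-library `timeit` module. This is only a rough equivalent: `%timeit` reports a mean and standard deviation, while this sketch reports the best run.

```python
import timeit

# Roughly the same measurement as `%timeit -r5 -n25 set(range(51))`:
# repeat=5 plays the role of -r5, number=25 the role of -n25.
run_totals = timeit.repeat("set(range(51))", repeat=5, number=25)

# Each entry is the total time of one run (25 loops), in seconds.
best_per_loop = min(run_totals) / 25
print(f"best of 5 runs: {best_per_loop * 1e6:.2f} microseconds per loop")
```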


### Measuring runtime with the `line_profiler` package
It can be installed with pip:
`pip install line_profiler`

To start using it, we need to load it first:
`%load_ext line_profiler`

After that, we can use it to measure the runtime of a function line by line.

For example, to measure the runtime of a function called `convert_units`:
`%lprun -f <function_name> function_name(function_arguments)`
`%lprun -f convert_units convert_units(heroes, hts, wts)`


## Measuring memory footprint (memory allocation)

A quick-and-dirty approach is the `getsizeof()` function from the `sys` module, which returns the memory used by an object in bytes.
It shows the consumption of an individual object only, so it is not suitable for identifying the memory footprint line by line.
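
A minimal sketch of `sys.getsizeof()` on made-up sample objects. It also illustrates a caveat worth knowing: for a container, the reported size covers the container itself, not the objects it refers to.

```python
import sys

# Size of two individual objects, in bytes (exact numbers vary by Python version).
nums_list = list(range(1000))
nums_tuple = tuple(range(1000))
print(sys.getsizeof(nums_list), sys.getsizeof(nums_tuple))

# getsizeof is shallow: a list of two large strings reports only the list itself.
words = ["x" * 10_000, "y" * 10_000]
shallow_size = sys.getsizeof(words)

# Adding the sizes of the elements gives a closer estimate of the real footprint.
deep_size = shallow_size + sum(sys.getsizeof(w) for w in words)
print(shallow_size, deep_size)
```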

There is a package called `memory_profiler`
Install:
`pip install memory_profiler`

Using:
`%load_ext memory_profiler`
`%mprun -f <function_name> function_name(function_arguments)`
`%mprun -f convert_units convert_units(heroes, hts, wts)`

`%mprun` can only profile functions that are defined in a file and imported into the session, not functions defined directly in the interactive session.


## Working with sets

Sets are faster than lists or tuples when we need to:
 - find elements that are in both collections (intersection)
 - find elements that are in only one collection (difference)
 - combine the elements of both collections (union)
 - find unique elements

In these cases it is better for runtime to use the set methods:
 - set_a.intersection(set_b) - elements that are present in both set_a and set_b
 - set_a.difference(set_b) - elements that are present in set_a but not in set_b
 - set_a.union(set_b) - all distinct elements from set_a and set_b
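
The set methods above can be sketched on two small lists. The list contents are made-up sample data, and the nested loop is shown only as the slower alternative the set methods replace.

```python
list_a = ["ant", "bee", "cat", "bee"]
list_b = ["cat", "dog", "bee"]

# Loop-based intersection: compares every element of list_a with every element of list_b.
in_both_loop = []
for a in list_a:
    for b in list_b:
        if a == b and a not in in_both_loop:
            in_both_loop.append(a)

# Set-based equivalents.
set_a, set_b = set(list_a), set(list_b)
in_both = set_a.intersection(set_b)   # elements in both
only_in_a = set_a.difference(set_b)   # elements in set_a but not in set_b
in_either = set_a.union(set_b)        # all distinct elements
unique_a = set(list_a)                # duplicates removed

print(in_both, only_in_a, in_either, unique_a)
```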

Membership testing (checking whether an element is in a collection) is much faster with sets:
 - 'element' in set_a
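
A rough way to see the difference is to time the same membership test on a list and on a set with the standard-library `timeit` module. The collection size and the searched value are arbitrary choices for illustration; absolute timings depend on the machine.

```python
import timeit

n = 100_000
numbers_list = list(range(n))
numbers_set = set(numbers_list)

# A value that is not present forces the list to be scanned to the end,
# while the set answers with a single hash lookup.
missing = -1

list_time = timeit.timeit(lambda: missing in numbers_list, number=100)
set_time = timeit.timeit(lambda: missing in numbers_set, number=100)

print(f"list: {list_time:.5f} s, set: {set_time:.5f} s")
```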


## Avoiding loops

Where possible, avoid explicit loops and use one-line expressions instead: list comprehensions, built-in functions, specialized imported modules, or sets.
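
A small sketch of the same tasks written with and without explicit loops. The sample list is made up, and `collections.Counter` is just one example of a specialized module, chosen here for illustration.

```python
from collections import Counter

names = ["Ana", "Bob", "Cy", "Ana"]

# Explicit loop: collect the length of each name.
lengths_loop = []
for name in names:
    lengths_loop.append(len(name))

# The same result with a list comprehension and with the built-in map().
lengths_comp = [len(name) for name in names]
lengths_map = list(map(len, names))

# Counting occurrences without a loop, using a standard-library module.
name_counts = Counter(names)

# Unique elements without a loop, using a set.
unique_names = set(names)

print(lengths_comp, name_counts, unique_names)
```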


## Iterating in Pandas

### Use `.iterrows()` to iterate over rows

Example:
```
# Iterate over pit_df and print each row
for i,row in pit_df.iterrows():
    print(row)
```

Example 2:
```
# Use one variable instead of two to store the result of .iterrows()
for row_tuple in pit_df.iterrows():
    print(row_tuple)
```
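
To make the two loop styles concrete, here is a self-contained sketch. The DataFrame contents are made-up sample values; only the name `pit_df` comes from the examples above. Each item produced by `.iterrows()` is an `(index, row)` tuple, where `row` is a pandas Series.

```python
import pandas as pd

# Made-up sample data for illustration.
pit_df = pd.DataFrame({
    "Team": ["PIT", "PIT"],
    "Year": [2011, 2012],
    "W": [72, 79],
    "G": [162, 162],
})

# Two loop variables: the tuple is unpacked into the index and the row.
for i, row in pit_df.iterrows():
    print(i, row["Year"], row["W"])

# One loop variable: each item is the whole (index, Series) tuple.
first_tuple = next(pit_df.iterrows())
first_index, first_row = first_tuple
print(type(first_tuple), first_index, first_row["W"])
```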

### Use `.itertuples()` to iterate over tuples
`.itertuples()` returns each DataFrame row as a special data type called a namedtuple. You can look up a field of a namedtuple by name with dot syntax, for example `row.Year`.

Example:
```
# Loop over the DataFrame and print each row
for row in rangers_df.itertuples():
    print(row)
```

For each row of the pandas DataFrame, it returns a structure like:
`Pandas(Index=0, Team='TEX', League='AL', Year=2012, RS=808, RA=707, W=93, G=162, Playoffs=1)`
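
To show how such a structure behaves without needing a DataFrame, the row above can be rebuilt by hand with the standard-library `collections.namedtuple`. This is only an imitation of what `.itertuples()` yields, using the field names and values from the output shown above.

```python
from collections import namedtuple

# Hand-built imitation of one row as produced by .itertuples().
Pandas = namedtuple(
    "Pandas",
    ["Index", "Team", "League", "Year", "RS", "RA", "W", "G", "Playoffs"],
)
row = Pandas(Index=0, Team="TEX", League="AL", Year=2012,
             RS=808, RA=707, W=93, G=162, Playoffs=1)

# Fields can be read by name (dot syntax) or by position, as with a plain tuple.
print(row.Team, row.Year, row.W)
print(row[1])
```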

Example 2:
```
# Loop over the DataFrame and print each row's Index, Year and Wins (W)
for row in rangers_df.itertuples():
    i = row.Index
    year = row.Year
    wins = row.W
    print(i, year, wins)
```

### Use the pandas `.apply()` method

Just as the built-in `map()` function applies a given function to every value in an iterable, `.apply()` applies a function to a DataFrame, either row by row (`axis=1`) or column by column (`axis=0`).

Example:
```
# Gather sum of all columns in dataframe
stat_totals = df.apply(sum, axis=0)
print(stat_totals)
```

Example 2:
```
# Gather total runs scored in all games per year
total_runs_scored = rays_df[['RS', 'RA']].apply(sum, axis=1)
print(total_runs_scored)
```

Example 3:
```
# Create a win percentage Series 
win_percs = dbacks_df.apply(lambda row: calc_win_perc(row['W'], row['G']), axis=1)
print(win_percs, '\n')
```
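
A self-contained sketch of the row-wise and column-wise cases. The DataFrame values are made up, and `calc_win_perc` is a hypothetical stand-in written for this sketch, since its definition is not shown in these notes.

```python
import pandas as pd

def calc_win_perc(wins, games_played):
    """Hypothetical helper: win percentage rounded to two decimals."""
    return round(wins / games_played * 100, 2)

# Made-up sample data for illustration.
dbacks_df = pd.DataFrame({"W": [81, 64], "G": [162, 162]})

# axis=1: the function receives one row at a time.
win_percs = dbacks_df.apply(lambda row: calc_win_perc(row["W"], row["G"]), axis=1)

# axis=0: the function receives one column at a time.
column_totals = dbacks_df.apply(sum, axis=0)

print(win_percs)
print(column_totals)
```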

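### Use the pandas `.values` attribute

Pandas is built on top of NumPy.
`df['column'].values` returns the column's data as a NumPy array (`np.ndarray`). Note that `.values` is an attribute, not a method, so it is written without parentheses. Arithmetic on these arrays is vectorized through NumPy broadcasting, which is much more efficient than looping over rows.

For example:
`baseball_df['RS'].values - baseball_df['RA'].values`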
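
A minimal sketch of this array arithmetic on a two-row DataFrame with made-up values. Here `.values` is read as an attribute, without parentheses.

```python
import pandas as pd

# Made-up sample data for illustration.
baseball_df = pd.DataFrame({"RS": [808, 734], "RA": [707, 688]})

# Each column's underlying data as a NumPy array.
rs = baseball_df["RS"].values
ra = baseball_df["RA"].values
print(type(rs))

# Element-wise subtraction over the whole arrays, with no Python-level loop.
run_diffs = rs - ra
print(run_diffs)
```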

