# Intro

This notebook supplements my [blog post](https://medium.com/@aneesh.kodali/chief-exec-officer-create-multiple-dataframes-at-once-53e256f31f01?source=friends_link&sk=73845dffdecf7bbf63d4a887644d28c5), where I outline how the `exec()` function works and how to use it to read in multiple files as dataframes.

I'll first show some examples to illustrate how `exec()` works. Recall, the steps are:

1) Create a list of variable names
2) Create a list of values
3) Iterate through both lists to assign each value to its variable
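
As a minimal sketch of those three steps (the names `variable_names`, `values`, and `namespace` are hypothetical and only here for illustration), the loop below builds one assignment statement per pair and hands it to `exec()`. It passes a dictionary as the second argument, so the new variables land in that dictionary and are easy to inspect:

```python
# Hypothetical names, for illustration only
variable_names = ["x", "y", "z"]   # 1) list of variable names
values = [10, 20, 30]              # 2) list of values

# exec() accepts an optional dictionary to use as its global namespace
namespace = {}

# 3) Build one assignment statement per pair and execute it
for variable_name, value in zip(variable_names, values):
    exec(f"{variable_name} = {value}", namespace)

print(namespace["x"], namespace["y"], namespace["z"])
```

Leaving out the dictionary, as the rest of this notebook does, creates the variables in the current global namespace instead.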

# `exec()` Examples

## Example: Calculate SumOne's Name

Let's say you want to find the 'sum' of a person's name. What I mean is:
- Assign a value to each letter of the alphabet
- Go through a name and add up the values of its letters

```python
import string

# Create list of variables
letterList = list(string.ascii_lowercase)

# Create list of values
numberList = list(range(1,27))

# Iterate through both lists to assign values
for letter, number in zip(letterList, numberList):
    exec(f"{letter} = {number}")

# Verify that variables have been created (%whos is an IPython magic)
%whos int
```

```
Variable   Type    Data/Info
----------------------------
a          int     1
b          int     2
c          int     3
d          int     4
e          int     5
f          int     6
g          int     7
h          int     8
i          int     9
j          int     10
k          int     11
l          int     12
m          int     13
n          int     14
number     int     26
o          int     15
p          int     16
q          int     17
r          int     18
s          int     19
t          int     20
u          int     21
v          int     22
w          int     23
x          int     24
y          int     25
z          int     26
```


Now we can find the sum of my name (in lower case):

```python
name = 'aneesh'
nameSum = 0
for letter in name:
    # Print each letter and its value
    exec(f"print(letter, {letter})")
    # Add letter value to nameSum
    exec(f"nameSum += {letter}")
nameSum
```

```
a 1
n 14
e 5
e 5
s 19
h 8

52
```
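
The loop above assumes the name contains only lower-case letters; a capital letter or a space would not match any of the variables created earlier. Here is a rough sketch of one way to guard against that. The helper `name_sum` and the dictionary `letter_values` are hypothetical names, and the dictionary is passed to `exec()` as its namespace so the example runs outside a notebook too:

```python
import string

# Create the letter variables inside a dictionary instead of the global namespace
letter_values = {}
for number, letter in enumerate(string.ascii_lowercase, start=1):
    exec(f"{letter} = {number}", letter_values)


def name_sum(name):
    """Add up the letter values in name, ignoring anything that is not a-z."""
    total = 0
    for letter in name.lower():
        if letter in string.ascii_lowercase:
            total += letter_values[letter]
    return total


print(name_sum("Aneesh"))
```

Lower-casing the name first and skipping anything outside `a` to `z` keeps the lookup from failing on input such as `"Aneesh K"`.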

## Creating and assigning variable values to 'themselves'

Now let's say we have a list of values. For each value in our list, we want to create a variable with the same name as the value itself. We can use `exec()` to do this as well. One thing to note: since our values are strings, the value on the right-hand side of the equals sign needs its own quotes (single or double) inside the string passed to `exec()`.

If this isn't clear, write out the code for the first iteration of the loop. It would look like:
                
```python
lion = "lion"
```
                
Remember that `exec()` executes a string as Python code. So, when passing in strings as the value, make sure they retain their own set of quotes. Still don't believe me? Try taking out the single quotes on the right-hand side of the `exec()` statement below. Python will treat `lion` as a variable name rather than the string "lion" and will raise a `NameError` because you haven't defined a variable called `lion` anywhere. It will think you're writing

'lion the variable' = 'lion the variable' 

instead of 

'lion the variable' = 'lion the string'
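
To see the difference without touching the notebook's own variables, here is a small sketch that runs both versions inside a separate dictionary (`namespace` and the `lion_copy` variable are hypothetical names used only for this illustration):

```python
namespace = {}
animal = "lion"

# Without quotes, the executed statement is `lion = lion`,
# and `lion` has not been defined yet
try:
    exec(f"{animal} = {animal}", namespace)
except NameError as error:
    error_message = str(error)
    print(error_message)

# With quotes, the executed statement is `lion = 'lion'`
exec(f"{animal} = '{animal}'", namespace)

# The !r conversion inserts repr(animal), which already includes the quotes
exec(f"{animal}_copy = {animal!r}", namespace)

print(namespace["lion"], namespace["lion_copy"])
```

The `!r` form is handy when a value might itself contain a quote character, since `repr()` picks quoting that keeps the generated statement valid.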

```python
animals = ["lion", "tiger", "bear"]

for animal in animals:
    exec(f"{animal} = '{animal}'")

# Verify (%whos is an IPython magic)
%whos str
```

```
Variable   Type    Data/Info
----------------------------
animal     str     bear
bear       str     bear
letter     str     h
lion       str     lion
name       str     aneesh
tiger      str     tiger
```


# Read in Files and Create DataFrames


Data is in the 'data' folder.

Old way: write out one `pd.read_csv()` statement per file

```python
bom_movie_gross = pd.read_csv('data/bom.movie_gross.csv')
imdb_name_basics = pd.read_csv('data/imdb.name.basics.csv')
imdb_title_akas = pd.read_csv('data/imdb.title.akas.csv')
...
tmdb_movies = pd.read_csv('data/tmdb.movies.csv')
tn_movie_budgets = pd.read_csv('data/tn.movie_budgets.csv')

```

New way: write out one `exec()` statement

```python
import pandas as pd
import os

fileList = os.listdir('data')

# Use the file names to create the dataframe names:
# drop the file extension, then replace the remaining periods with underscores
dfList = [os.path.splitext(x)[0].replace('.', '_') for x in fileList]

# Read each file into a dataframe stored under its matching name
for df, file in zip(dfList, fileList):
    exec(f"{df} = pd.read_csv('data/{file}')")

# Verify (%whos is an IPython magic)
%whos DataFrame
```

```
Variable                Type         Data/Info
----------------------------------------------
bom_movie_gross         DataFrame                             <...>\n[3387 rows x 5 columns]
imdb_name_basics        DataFrame               nconst        <...>[606648 rows x 6 columns]
imdb_title_akas         DataFrame             title_id  orderi<...>[331703 rows x 8 columns]
imdb_title_basics       DataFrame               tconst        <...>[146144 rows x 6 columns]
imdb_title_crew         DataFrame               tconst        <...>[146144 rows x 3 columns]
imdb_title_principals   DataFrame                tconst  order<...>1028186 rows x 6 columns]
imdb_title_ratings      DataFrame               tconst  averag<...>n[73856 rows x 3 columns]
tmdb_movies             DataFrame           Unnamed: 0        <...>[26517 rows x 10 columns]
tn_movie_budgets        DataFrame          id  release_date   <...>\n[5782 rows x 6 columns]
```
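
Since each generated name ends up on the left-hand side of an assignment, it has to be a valid Python identifier. The sketch below isolates the name-building step on three of the file names shown above and checks the result with `str.isidentifier()`. The helper `to_variable_name` is a hypothetical name, and no files are read here:

```python
import os

# Three of the file names from the 'data' folder, hard-coded for illustration
file_names = ["bom.movie_gross.csv", "imdb.name.basics.csv", "tn.movie_budgets.csv"]


def to_variable_name(file_name):
    """Drop the file extension and replace the remaining periods with underscores."""
    stem, _extension = os.path.splitext(file_name)
    return stem.replace(".", "_")


df_names = [to_variable_name(file_name) for file_name in file_names]

# A name that fails this check (for example one starting with a digit or
# containing a hyphen) would make the exec() assignment raise a SyntaxError
valid_names = [name for name in df_names if name.isidentifier()]

print(df_names)
```

Running a check like this before the `exec()` loop makes it easier to spot a file whose name needs extra cleaning.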
