### Glob

If a directory holds several files, it is usually easier to load them with the `glob` module than to read each one by hand. Suppose you have five CSV files. Loading them manually would look like this:

```python
first_df = pandas.read_csv("file1.csv")
second_df = pandas.read_csv("file2.csv")
third_df = pandas.read_csv("file3.csv")
fourth_df = pandas.read_csv("file4.csv")
fifth_df = pandas.read_csv("file5.csv")
```

As you can see, that quickly gets tedious. Instead, you can read the files in a loop using `glob`. First, import `glob` and `pandas`, and create an empty DataFrame that will collect the rows from every file:

In [1]:
import glob
import pandas

iris_dataframe = pandas.DataFrame()

Next, we use the `glob.glob()` function to find the files in a directory. The `*` in `*.csv` is a wildcard that matches any run of characters, so `*.csv` matches every file whose name ends in `.csv`. If we wanted only the files named file`x`.csv, we could use `file*.csv`.
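
To make the wildcard behaviour concrete, here is a minimal sketch that creates a temporary directory with a few empty files and compares the two patterns. The file names are made up for the demonstration and are not part of the iris data. The results are sorted because `glob.glob()` does not promise any particular order.

```python
import glob
import os
import tempfile

# Build a throwaway directory containing a few hypothetical, empty files.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ["file1.csv", "file2.csv", "notes.csv", "file1.txt"]:
        open(os.path.join(tmp_dir, name), "w").close()

    # "*.csv" matches every name that ends in ".csv".
    all_csv = sorted(
        os.path.basename(path)
        for path in glob.glob(os.path.join(tmp_dir, "*.csv"))
    )

    # "file*.csv" keeps only names that also start with "file".
    numbered_csv = sorted(
        os.path.basename(path)
        for path in glob.glob(os.path.join(tmp_dir, "file*.csv"))
    )

print(all_csv)       # ['file1.csv', 'file2.csv', 'notes.csv']
print(numbered_csv)  # ['file1.csv', 'file2.csv']
```

`glob.iglob()` accepts the same patterns but returns an iterator instead of a list, which can help when a directory holds a very large number of files.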

In [2]:
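for file in glob.glob("*.csv"):
    this_file_dataframe = pandas.read_csv(file)
    # DataFrame.append() was removed in pandas 2.0, so use pandas.concat() to add the new rows
    iris_dataframe = pandas.concat([iris_dataframe, this_file_dataframe], ignore_index=True)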
    
print(iris_dataframe)

     sepal_length  sepal_width  petal_length  petal_width         species
0           5.100        3.500         1.400        0.200     Iris-setosa
1           4.900        3.000         1.400        0.200     Iris-setosa
2           4.700        3.200         1.300        0.200     Iris-setosa
3           4.600        3.100         1.500        0.200     Iris-setosa
4           5.000        3.600         1.400        0.200     Iris-setosa
5           5.400        3.900         1.700        0.400     Iris-setosa
6           4.600        3.400         1.400        0.300     Iris-setosa
7           5.000        3.400         1.500        0.200     Iris-setosa
8           4.400        2.900         1.400        0.200     Iris-setosa
9           4.900        3.100         1.500        0.100     Iris-setosa
10          5.400        3.700         1.500        0.200     Iris-setosa
11          4.800        3.400         1.600        0.200     Iris-setosa
12          4.800        3.000        
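
Growing a DataFrame inside the loop copies all the accumulated rows on every pass. A common alternative is to collect the per-file DataFrames in a list and combine them once with `pandas.concat()`. The sketch below assumes every file shares the same columns. The helper name `load_csv_files` and the two tiny demo files are hypothetical and exist only for illustration.

```python
import glob
import os
import tempfile

import pandas


def load_csv_files(pattern):
    """Read every file matching `pattern` and stack the rows into one DataFrame."""
    paths = sorted(glob.glob(pattern))  # sort so the row order is reproducible
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")
    frames = [pandas.read_csv(path) for path in paths]
    # ignore_index=True renumbers the rows 0..n-1 across all files.
    return pandas.concat(frames, ignore_index=True)


# Demonstration with two small, made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    pandas.DataFrame(
        {"sepal_length": [5.1, 4.9], "species": ["Iris-setosa", "Iris-setosa"]}
    ).to_csv(os.path.join(tmp_dir, "part1.csv"), index=False)
    pandas.DataFrame(
        {"sepal_length": [7.0], "species": ["Iris-versicolor"]}
    ).to_csv(os.path.join(tmp_dir, "part2.csv"), index=False)

    combined = load_csv_files(os.path.join(tmp_dir, "*.csv"))

print(combined.shape)  # (3, 2)
```

This version also works on pandas 2.0 and later, where `DataFrame.append()` is no longer available.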

Now the rows from every file live in a single DataFrame, `iris_dataframe`. You can work with it like any other DataFrame. For example, drawing three random rows with `sample(3)` yields:

In [3]:
print(iris_dataframe.sample(3)) # three randomly chosen rows, so the output changes from run to run

    sepal_length  sepal_width  petal_length  petal_width          species
90           5.5          2.6           4.4          1.2  Iris-versicolor
72           6.3          2.5           4.9          1.5  Iris-versicolor
50           7.0          3.2           4.7          1.4  Iris-versicolor
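
Because `sample()` picks rows at random, rerunning the cell usually shows different rows. If you need the same rows every time, you can pass a `random_state` seed. The small DataFrame below is made-up stand-in data, not the real iris file, and serves only as a hedged illustration.

```python
import pandas

# Hypothetical stand-in for the combined iris DataFrame.
frame = pandas.DataFrame(
    {
        "petal_length": [1.4, 4.7, 6.0, 1.3, 4.5, 5.1],
        "species": [
            "Iris-setosa", "Iris-versicolor", "Iris-virginica",
            "Iris-setosa", "Iris-versicolor", "Iris-virginica",
        ],
    }
)

# The same seed gives the same three rows on every call.
first_draw = frame.sample(3, random_state=0)
second_draw = frame.sample(3, random_state=0)
print(first_draw.equals(second_draw))  # True

# The sampled rows keep their original index labels,
# so they can be looked up again in the full DataFrame.
print(frame.loc[first_draw.index].equals(first_draw))  # True
```

The numbers in the left-hand column of a sample are therefore the row labels from the combined DataFrame, not positions within the sample.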


In [13]:
import numpy as np


# Every column except the last one ('species') holds a numeric measurement
cols = iris_dataframe.columns.values[:-1]
print(cols)

# Multiply the measurements of the Iris-virginica rows by 2.54 and leave all other rows unchanged
is_virginica = iris_dataframe['species'].str.lower() == 'iris-virginica'
for col in cols:
    iris_dataframe[col] = np.where(is_virginica, iris_dataframe[col] * 2.54, iris_dataframe[col])
print(iris_dataframe)

['sepal_length' 'sepal_width' 'petal_length' 'petal_width']
     sepal_length  sepal_width  petal_length  petal_width         species
0         5.10000      3.50000       1.40000      0.20000     Iris-setosa
1         4.90000      3.00000       1.40000      0.20000     Iris-setosa
2         4.70000      3.20000       1.30000      0.20000     Iris-setosa
3         4.60000      3.10000       1.50000      0.20000     Iris-setosa
4         5.00000      3.60000       1.40000      0.20000     Iris-setosa
5         5.40000      3.90000       1.70000      0.40000     Iris-setosa
6         4.60000      3.40000       1.40000      0.30000     Iris-setosa
7         5.00000      3.40000       1.50000      0.20000     Iris-setosa
8         4.40000      2.90000       1.40000      0.20000     Iris-setosa
9         4.90000      3.10000       1.50000      0.10000     Iris-setosa
10        5.40000      3.70000       1.50000      0.20000     Iris-setosa
11        4.80000      3.40000       1.60000      0.