### Glob

If a directory contains several files, loading them with `glob` is usually easier than reading each one by hand. Suppose you have five files. Loading them manually would look like this:

```
first_df = pandas.read_csv("file1.csv")
second_df = pandas.read_csv("file2.csv")
third_df = pandas.read_csv("file3.csv")
fourth_df = pandas.read_csv("file4.csv")
fifth_df = pandas.read_csv("file5.csv")
```

As you can see, that gets tedious quickly. Instead, you can read the files in a loop using glob. First, import glob and pandas, then create an empty dataframe to collect the rows from every file:

In [13]:
import glob
import pandas

iris_dataframe = pandas.DataFrame()

Next, use the `glob.glob()` function to list the files in a directory whose names match a pattern. The `*` in `*.csv` is a wildcard that matches any sequence of characters, so `*.csv` matches every file name ending in `.csv`. To match only files named file`x`.csv, use `file*.csv`.
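To make the wildcard behaviour concrete, here is a minimal sketch that runs both patterns against a temporary directory. The file names are hypothetical and only there for illustration; `glob.iglob()` is included to show that it returns the same matches as `glob.glob()`, just lazily.

```python
import glob
import os
import tempfile

# Build a throwaway directory with hypothetical file names to match against.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ["file1.csv", "file2.csv", "notes.csv", "file3.txt"]:
        open(os.path.join(tmp_dir, name), "w").close()

    # "*.csv" matches every name that ends in ".csv".
    all_csv = sorted(
        os.path.basename(path) for path in glob.glob(os.path.join(tmp_dir, "*.csv"))
    )

    # "file*.csv" additionally requires the name to start with "file".
    file_csv = sorted(
        os.path.basename(path) for path in glob.glob(os.path.join(tmp_dir, "file*.csv"))
    )

    # glob.iglob() yields the same matches one at a time instead of building a list.
    lazy_csv = sorted(
        os.path.basename(path) for path in glob.iglob(os.path.join(tmp_dir, "*.csv"))
    )

print(all_csv)
print(file_csv)
```

Note that `glob` returns matches in no guaranteed order, which is why the sketch sorts them before printing.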

In [14]:
for file in glob.glob("*.csv"):
    this_file_dataframe = pandas.read_csv(file)
    # DataFrame.append() was removed in pandas 2.0; pandas.concat() replaces it
    iris_dataframe = pandas.concat([iris_dataframe, this_file_dataframe], ignore_index=True)

print(iris_dataframe)

     0.984  1.299  2.362   2.48  Iris-virginica  petal_length  petal_width  \
0      NaN    NaN    NaN    NaN             NaN           1.4          0.2   
1      NaN    NaN    NaN    NaN             NaN           1.4          0.2   
2      NaN    NaN    NaN    NaN             NaN           1.3          0.2   
3      NaN    NaN    NaN    NaN             NaN           1.5          0.2   
4      NaN    NaN    NaN    NaN             NaN           1.4          0.2   
5      NaN    NaN    NaN    NaN             NaN           1.7          0.4   
6      NaN    NaN    NaN    NaN             NaN           1.4          0.3   
7      NaN    NaN    NaN    NaN             NaN           1.5          0.2   
8      NaN    NaN    NaN    NaN             NaN           1.4          0.2   
9      NaN    NaN    NaN    NaN             NaN           1.5          0.1   
10     NaN    NaN    NaN    NaN             NaN           1.5          0.2   
11     NaN    NaN    NaN    NaN             NaN           1.6   
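A common alternative to growing one dataframe inside the loop is to collect the per-file dataframes in a list and concatenate them once at the end. The sketch below assumes two small, hypothetical CSV files (`part1.csv` and `part2.csv`) written to a temporary directory so that it runs on its own; it is not the data used in this section.

```python
import glob
import os
import tempfile

import pandas

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write two tiny hypothetical CSV files that share the same header row.
    rows = [("part1.csv", "5.1,Iris-setosa"), ("part2.csv", "6.3,Iris-virginica")]
    for name, row in rows:
        with open(os.path.join(tmp_dir, name), "w") as handle:
            handle.write("sepal_length,species\n" + row + "\n")

    # Read each file into its own dataframe; sorted() makes the order predictable.
    paths = sorted(glob.glob(os.path.join(tmp_dir, "*.csv")))
    frames = [pandas.read_csv(path) for path in paths]

# Stack all dataframes in one call and renumber the rows from 0.
combined = pandas.concat(frames, ignore_index=True)
print(combined)
```

Concatenating once avoids copying the accumulated rows on every iteration. Also note that `pandas.read_csv()` treats the first line of each file as the header by default. If one file has no header row, its first data row becomes the column names, which would be consistent with the mostly-`NaN` columns such as `0.984` in the output above; passing `header=None` together with `names=[...]` is one way to read such a file.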

Now the rows from every file are combined in a single dataframe. To spot-check the result, you can print a few randomly chosen rows with `sample()`:

In [8]:
print(iris_dataframe.sample(3))  # sample(3) picks three rows at random, so your output will differ

     sepal_length  sepal_width  petal_length  petal_width         species
6        4.600000     3.400000      1.400000     0.300000     Iris-setosa
133      2.480315     1.102362      2.007874     0.590551  Iris-virginica
24       4.800000     3.400000      1.900000     0.200000     Iris-setosa


In [12]:
iris_dataframe['species'].unique()

array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype=object)