### Glob

If you have multiple files in a directory, it is often easier to load them with glob than to list each one by hand. Let's say you have 5 files. Loading them manually would look like this:

```
first_df = pandas.read_csv("file1.csv")
second_df = pandas.read_csv("file2.csv")
third_df = pandas.read_csv("file3.csv")
fourth_df = pandas.read_csv("file4.csv")
fifth_df = pandas.read_csv("file5.csv")
```

As you can see, that gets tedious quickly. Instead, you can load the files in a loop using glob. First, we will import glob and pandas and set up the dataframe that will hold the combined data:

In [6]:
import glob
import pandas

iris_dataframe = pandas.DataFrame()

Next, we will use the `glob.glob()` function to find the files in a directory that match a pattern. Note that `*` matches any sequence of characters, so `*.csv` matches every file whose name ends in `.csv`. If we wanted just the files named file`x`.csv, we could use `file*.csv` instead.
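
To make the pattern rules concrete, here is a minimal sketch. The temporary directory and the file names in it (including `notes.csv` and `file3.txt`) are made up for this demonstration and are not part of the tutorial's data.

```python
import glob
import os
import tempfile

# Build a throwaway directory with hypothetical file names for the demo.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ["file1.csv", "file2.csv", "notes.csv", "file3.txt"]:
        open(os.path.join(tmp_dir, name), "w").close()

    # "*" matches any run of characters, so "*.csv" matches every name ending in ".csv".
    all_csv = sorted(
        os.path.basename(path) for path in glob.glob(os.path.join(tmp_dir, "*.csv"))
    )

    # "file*.csv" additionally requires the name to start with "file".
    file_csv = sorted(
        os.path.basename(path) for path in glob.glob(os.path.join(tmp_dir, "file*.csv"))
    )

print(all_csv)   # ['file1.csv', 'file2.csv', 'notes.csv']
print(file_csv)  # ['file1.csv', 'file2.csv']
```

The results are sorted here only to make the printed order predictable; `glob.glob()` itself does not promise any particular order.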

In [7]:
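# Read every matching file, then combine the results in one step.
# (DataFrame.append() was removed in pandas 2.0; pandas.concat() replaces it.)
file_dataframes = []
for file in glob.glob("*.csv"):
    file_dataframes.append(pandas.read_csv(file))
iris_dataframe = pandas.concat(file_dataframes, ignore_index=True)

print(iris_dataframe)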

     sepal_length  sepal_width  petal_length  petal_width         species
0        5.100000     3.500000      1.400000     0.200000     Iris-setosa
1        4.900000     3.000000      1.400000     0.200000     Iris-setosa
2        4.700000     3.200000      1.300000     0.200000     Iris-setosa
3        4.600000     3.100000      1.500000     0.200000     Iris-setosa
4        5.000000     3.600000      1.400000     0.200000     Iris-setosa
5        5.400000     3.900000      1.700000     0.400000     Iris-setosa
6        4.600000     3.400000      1.400000     0.300000     Iris-setosa
7        5.000000     3.400000      1.500000     0.200000     Iris-setosa
8        4.400000     2.900000      1.400000     0.200000     Iris-setosa
9        4.900000     3.100000      1.500000     0.100000     Iris-setosa
10       5.400000     3.700000      1.500000     0.200000     Iris-setosa
11       4.800000     3.400000      1.600000     0.200000     Iris-setosa
12       4.800000     3.000000      1.

Now the rows from every file are combined in a single dataframe, `iris_dataframe`. To spot-check the result, you can draw a few random rows with `sample()`. Because the rows are chosen at random, your output will differ from the one shown here:

In [8]:
print(iris_dataframe.sample(3))  # three randomly chosen rows from the combined dataframe

     sepal_length  sepal_width  petal_length  petal_width         species
6        4.600000     3.400000      1.400000     0.300000     Iris-setosa
133      2.480315     1.102362      2.007874     0.590551  Iris-virginica
24       4.800000     3.400000      1.900000     0.200000     Iris-setosa
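
If you would rather keep each file's data separate instead of combining everything, one option is to collect the dataframes in a list and pick one out by its index. The following is only a sketch under stated assumptions: the two small CSV files (`iris1.csv`, `iris2.csv`) and their contents are hypothetical and are written to a temporary directory so the example can run anywhere.

```python
import glob
import os
import tempfile

import pandas

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write two tiny, made-up CSV files for the demonstration.
    pandas.DataFrame({"sepal_length": [5.1, 4.9]}).to_csv(
        os.path.join(tmp_dir, "iris1.csv"), index=False
    )
    pandas.DataFrame({"sepal_length": [6.3]}).to_csv(
        os.path.join(tmp_dir, "iris2.csv"), index=False
    )

    # Sort the paths so that index 0 reliably refers to iris1.csv.
    paths = sorted(glob.glob(os.path.join(tmp_dir, "*.csv")))

    # One dataframe per file, in the same order as `paths`.
    dataframes = [pandas.read_csv(path) for path in paths]

first_df = dataframes[0]  # the contents of iris1.csv
print(len(dataframes))
print(first_df)
```

Keeping the list of paths alongside the list of dataframes makes it easy to tell which index belongs to which file.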


To check which species the combined dataframe contains, list the unique values in the `species` column:

In [12]:
iris_dataframe['species'].unique()

array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype=object)