This notebook comes from this repository: https://github.com/Azure/azureml-examples/blob/main/sdk/python/using-mltable/from-paths-example/from-paths-example.ipynb
The accompanying tutorial: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-mltable

# 🛣️ Creating a Table from paths

You can create a table containing the paths of files on cloud storage. In this example, there are some dog and cat images stored in cloud storage in the following folder structure:

```
/pet-images
  /cat
    0.jpeg
    1.jpeg
    ...
  /dog
    0.jpeg
    1.jpeg
```

MLTable can extract the storage URIs of these images and the useful folder names for labelling purposes.
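
To make the labelling idea concrete, here is a minimal sketch in plain Python of how a label could be read from the folder name in a URI. It does not use MLTable's own API; the helper `label_from_uri` and the `example.blob.core.windows.net` URIs are hypothetical and only mirror the folder structure shown above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def label_from_uri(uri: str) -> str:
    """Return the name of the folder that directly contains the file."""
    return PurePosixPath(urlparse(uri).path).parent.name


# Hypothetical URIs that mirror the /pet-images folder structure
uris = [
    "https://example.blob.core.windows.net/pet-images/cat/0.jpeg",
    "https://example.blob.core.windows.net/pet-images/dog/1.jpeg",
]

labels = [label_from_uri(u) for u in uris]
print(labels)  # ['cat', 'dog']
```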

## 📦 Install dependencies

Ensure you have the latest MLTable library and dependencies.

In [None]:
%pip install -r ./mltable-requirements.txt

## 🐍 Create an MLTable using the Python SDK

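Here you build your data loading steps using the `mltable` Python SDK. The `show()` method lets you preview the effect of the data loading transformation. Note that the cell below points at 900 numbered JPEG files hosted on GitHub rather than the pet images described above.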

In [19]:
import mltable

# create paths to the data files
paths = []

for i in range(900):
    paths.append(
        {
            "file": f"https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/{i:05}.jpg"
        }
    )

# create the mltable
tbl = mltable.from_paths(paths)

tbl.show()

| | Path |
|---|---|
| 0 | https://raw.githubusercontent.com/Tracer1337/p... |
| 1 | https://raw.githubusercontent.com/Tracer1337/p... |
| 2 | https://raw.githubusercontent.com/Tracer1337/p... |
| 3 | https://raw.githubusercontent.com/Tracer1337/p... |
| 4 | https://raw.githubusercontent.com/Tracer1337/p... |
| 5 | https://raw.githubusercontent.com/Tracer1337/p... |
| 6 | https://raw.githubusercontent.com/Tracer1337/p... |
| 7 | https://raw.githubusercontent.com/Tracer1337/p... |
| 8 | https://raw.githubusercontent.com/Tracer1337/p... |
| 9 | https://raw.githubusercontent.com/Tracer1337/p... |
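
The list of path dictionaries built in the cell above can also be written as a list comprehension. The sketch below does this without importing `mltable`, so that the URL pattern can be checked on its own; the constant name `BASE_URL` is introduced here for illustration only.

```python
BASE_URL = "https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013"

# {i:05} zero-pads the index to five digits, e.g. 7 becomes "00007"
paths = [{"file": f"{BASE_URL}/{i:05}.jpg"} for i in range(900)]

print(len(paths))
print(paths[0]["file"])
print(paths[-1]["file"])
```

The resulting `paths` list has the same shape as the one passed to `mltable.from_paths` above: one dictionary per file, each with a single `file` key.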


### 🐼 Load into a Pandas data frame

You can load your Azure ML Table into Pandas using:

In [29]:
df = tbl.to_pandas_dataframe()
df.iloc[0].Path

StreamInfo[Http](https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00000.jpg)

### 💾 Save data loading steps

Next, you'll save all your data loading steps into an `MLTable` file. This allows you to *reproduce* your Pandas data frame at a later point in time without having to redefine the data loading steps in your code.

In [31]:
# save the data loading steps in an MLTable file
tbl.save("./images")

#### 🔍 View the saved file

The next code cell prints the `MLTable` file so you can see how the data loading steps are serialized.

In [32]:
with open("./images/MLTable", "r") as f:
    print(f.read())

paths:
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00000.jpg
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00001.jpg
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00002.jpg
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00003.jpg
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00004.jpg
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00005.jpg
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00006.jpg
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00007.jpg
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00008.jpg
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/000
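
Because the saved file is plain text, its entries can be inspected without the SDK. The following is a minimal sketch that assumes every entry has the `- file: <url>` shape seen in the output above; the `sample` string is a shortened stand-in for the real file, and `extract_file_urls` is a hypothetical helper. A proper YAML parser would be more robust for files that contain other keys.

```python
# Shortened stand-in for the contents of ./images/MLTable
sample = """\
paths:
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00000.jpg
- file: https://raw.githubusercontent.com/Tracer1337/praxisprojekt-fom/main/FullIJCNN2013/00001.jpg
"""


def extract_file_urls(text: str) -> list:
    """Collect the URL from every line of the form '- file: <url>'."""
    prefix = "- file:"
    urls = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            urls.append(stripped[len(prefix):].strip())
    return urls


urls = extract_file_urls(sample)
print(len(urls))
print(urls[0])
```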

## ♻️ Reproduce data loading steps

Now that the data loading steps have been serialized into a file, you can reproduce them at any point in time using the `load()` method. This means you do not need to redefine your data loading steps in code, and it makes them easier to share with others.

In [35]:
import mltable

# load the previously saved MLTable file
tbl = mltable.load("./images")
df = tbl.to_pandas_dataframe()
df.head(5)

| | Path |
|---|---|
| 0 | https://raw.githubusercontent.com/Tracer1337/p... |
| 1 | https://raw.githubusercontent.com/Tracer1337/p... |
| 2 | https://raw.githubusercontent.com/Tracer1337/p... |
| 3 | https://raw.githubusercontent.com/Tracer1337/p... |
| 4 | https://raw.githubusercontent.com/Tracer1337/p... |
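
In the cells above, `save()` and `load()` are both given the folder `./images`, while the serialized steps live in a file named `MLTable` inside it. When sharing the folder with others, a quick check that the file is present can help. This is a sketch using only the standard library; `has_mltable_file` is a hypothetical helper, and a temporary directory stands in for `./images`.

```python
from pathlib import Path
import tempfile


def has_mltable_file(folder) -> bool:
    """Return True if `folder` directly contains a file named MLTable."""
    return (Path(folder) / "MLTable").is_file()


# Demonstration with a throwaway directory instead of ./images
with tempfile.TemporaryDirectory() as tmp:
    before = has_mltable_file(tmp)
    (Path(tmp) / "MLTable").write_text("paths:\n- file: https://example.com/0.jpg\n")
    after = has_mltable_file(tmp)

print(before, after)  # False True
```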
