# Weights & Biases: Runs, Experiments and Uploading Artifacts

This notebook shows very basic but useful examples of how projects, runs and artifacts are created locally and registered/uploaded to the W&B registry.

To use it, you need a Weights & Biases account; then run `wandb login` and log in via the web.

The commands shown here affect the projects in our W&B account, which are accessible via the web. Thus, always check the W&B web interface interactively to see the changes.

Whenever we execute `wandb.init()`, a local `wandb` folder is created with the run files and logs; I add that folder to `.gitignore`.
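As a small convenience, the step of ignoring the `wandb` folder can be scripted. The following is only a sketch using the standard library; the helper name `ensure_ignored` is hypothetical and not part of `wandb`, and it assumes the `.gitignore` file lives in the directory passed to it (by default, the current one).

```python
from pathlib import Path


def ensure_ignored(entry: str = "wandb/", gitignore: Path = Path(".gitignore")) -> bool:
    """Append `entry` to the .gitignore file unless it is already listed.

    Hypothetical helper (not part of wandb).
    Returns True if the file was modified, False otherwise.
    """
    text = gitignore.read_text() if gitignore.exists() else ""
    # Nothing to do if the entry is already on a line of its own
    if entry in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line if the file does not end with a newline
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with gitignore.open("a") as fp:
        fp.write(prefix + entry + "\n")
    return True
```

Calling `ensure_ignored()` once from the notebook's directory adds `wandb/` to `.gitignore`; calling it again leaves the file untouched.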

Note that in my W&B account I created a group `datamix-ai`, which appears in the package output; however, I'm logged in as `mxagar`. As far as I know, that has no effect.

### Overview of Contents

1. Create a file to be an artifact and instantiate a run
2. Instantiate an artifact, attach the file to it and attach the artifact to the run
3. Change the file and re-attach to artifact & run
4. Using runs with context managers

## 1. Create a file to be an artifact and instantiate a run

In [1]:
# We must be logged in: $ wandb login
import wandb

In [13]:
# Create a file
with open("my_artifact.txt", "w+") as fp:
    fp.write("This is an example of an artifact.")

In [14]:
# Check that the file is in the local directory
!ls

01_WandB_Upload_Artifact.ipynb my_artifact.txt
my_artifact                    wandb


In [15]:
# Instantiate a run
run = wandb.init(project="demo_artifact",
                 group="experiment_1")

Now, we go to the W&B page and look for the project: [https://wandb.ai/mxagar/projects](https://wandb.ai/mxagar/projects).

We will find the project, which contains the group `experiment_1` and the run with the automatic name `eternal-planet-1`.

In Jupyter, we also get a link to the run when we execute a run with `wandb.init()`.

In [22]:
# To check the wandb object and function options
#wandb.init?
#wandb.Artifact?

## 2. Instantiate an artifact, attach the file to it and attach the artifact to the run

In [17]:
# Instantiate an artifact
artifact = wandb.Artifact(
    name="my_artifact.txt", # does not need to be the name of the file
    type="data", # this is to group artifacts together
    description="This is an example of an artifact",
    metadata={ # metadata is an optional dictionary; we can use it for searching later on
        "key_1":"value_1"
    }
)

In [18]:
# We attach a file to the artifact; we can attach several files!
artifact.add_file("my_artifact.txt")

<ManifestEntry digest: bPkpOLyTUhHg8TmNSBWd9g==>

In [19]:
# We attach the artifact to the run
run.log_artifact(artifact)

<wandb.sdk.wandb_artifacts.Artifact at 0x7fd4033073d0>

Attaching the artifact to the run doesn't mean that it has already been uploaded to the W&B registry. W&B uploads data when we close a run (e.g., when exiting the notebook) or periodically (auto-upload).

In [20]:
# We can manually finish the run to force W&B to upload the artifacts
# We cannot use the run object anymore after finish()
run.finish()

Now, we can check that the artifact is on the W&B web interface: [https://wandb.ai/mxagar/projects](https://wandb.ai/mxagar/projects) / `choose project` / `Artifacts icon`

## 3. Change the file and re-attach to artifact & run

When we change and re-attach the file, we will have a new version in the W&B web interface. However, a new version is registered **only** if the file has changed!
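To illustrate why unchanged content does not produce a new version, here is a minimal sketch based on content checksums. The digest printed by `add_file()` above has the shape of a base64-encoded 16-byte hash; the use of MD5 below and the helper `content_digest` are assumptions made for illustration, not a description of W&B's internal implementation.

```python
import base64
import hashlib


def content_digest(data: bytes) -> str:
    """Return a base64-encoded MD5 digest of the given bytes (illustrative helper)."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


original = b"This is an example of an artifact."
changed = b"This is an example of an artifact changed."

# Identical bytes always yield the same digest
assert content_digest(original) == content_digest(original)
# Any change in the bytes yields a different digest
assert content_digest(original) != content_digest(changed)

print(content_digest(original))
print(content_digest(changed))
```

If versions are keyed on such a digest, re-logging identical content has nothing new to register, while any edit to the file produces a different digest and therefore a new version.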

In [24]:
# Change the file
with open("my_artifact.txt", "w+") as fp:
    fp.write("This is an example of an artifact changed.")

# Instantiate a run
run = wandb.init(project="demo_artifact",
                 group="experiment_1")

# Instantiate an artifact
artifact = wandb.Artifact(
    name="my_artifact.txt", # does not need to be the name of the file
    type="data", # this is to group artifacts together
    description="This is an example of an artifact",
    metadata={ # metadata is an optional dictionary; we can use it for searching later on
        "key_1":"value_1"
    }
)

# We attach a file to the artifact; we can attach several files!
artifact.add_file("my_artifact.txt")
run.log_artifact(artifact)

<wandb.sdk.wandb_artifacts.Artifact at 0x7fd4032eefd0>

In [25]:
# We can manually finish the run to force W&B to upload the artifacts
run.finish()

## 4. Using runs with context managers

If we use context managers, it's easier to handle several runs. Several runs make sense, for instance, when we're doing hyperparameter tuning. We don't need to call `run.finish()`, since that's handled by the context manager.
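To see why no explicit `finish()` call is needed, consider the following toy sketch of the context manager protocol. `ToyRun` is a hypothetical stand-in written for this illustration; it is not the `wandb` API and does not contact any server.

```python
class ToyRun:
    """Hypothetical stand-in for a run; NOT the wandb API."""

    def __init__(self, name: str):
        self.name = name
        self.finished = False

    def finish(self) -> None:
        # A real run would flush and upload pending data here
        self.finished = True

    def __enter__(self) -> "ToyRun":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        # Called when leaving the `with` block, also if an exception was raised
        self.finish()
        return False  # do not swallow exceptions


runs = []
for learning_rate in (0.1, 0.01):  # e.g., one run per hyperparameter value
    with ToyRun(name=f"lr_{learning_rate}") as run:
        runs.append(run)
        assert not run.finished  # still open inside the block

# Every run was finished automatically when its block ended
assert all(r.finished for r in runs)
```

The same pattern is what makes it convenient to open one run per hyperparameter setting in a loop: each run is closed as soon as its `with` block ends.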

In [26]:
with wandb.init(project="demo_artifact", group="experiment_1") as run:

    with open("my_artifact.txt", "w+") as fp:
        fp.write("This is an example of an artifact.")

    artifact = wandb.Artifact(
        name="my_artifact.txt", # does not need to be the name of the file
        type="data", # this is to group artifacts together
        description="This is an example of an artifact",
        metadata={ # metadata is an optional dictionary; we can use it for searching later on
            "key_1":"value_1"
        }
    )
    
    artifact.add_file("my_artifact.txt")
    # Attach the artifact to the run; without this call nothing is logged
    run.log_artifact(artifact)

with wandb.init(project="demo_artifact", group="experiment_1") as run:

    with open("my_artifact.txt", "w+") as fp:
        fp.write("This is an example of an artifact changed again.")

    artifact = wandb.Artifact(
        name="my_artifact.txt", # does not need to be the name of the file
        type="data", # this is to group artifacts together
        description="This is an example of an artifact",
        metadata={ # metadata is an optional dictionary; we can use it for searching later on
            "key_1":"value_1"
        }
    )
    
    artifact.add_file("my_artifact.txt")
    # Attach the artifact to the run; without this call nothing is logged
    run.log_artifact(artifact)
