# Multiplayer Metaflow

## Sharing

Data scientists on the same Metaflow deployment have read access to each other's Metaflow results.
This is useful in a project because you can build incrementally on each other's results without reinventing the wheel.
You can also compare approaches and results, and split work seamlessly.

To get the most out of Metaflow in a team setting, you should understand two concepts: namespaces and projects.

## Namespaces

[Namespaces](https://docs.metaflow.org/scaling/tagging#namespaces) help you keep results organized.
A flow runs in a namespace, and the Metaflow Client API retrieves results for you based on the active namespace.
Namespaces are not about security or access control; they organize results by the users who produce and analyze them.

In [1]:
from metaflow import namespace, Metaflow

In [2]:
namespace('user:sandbox')
Metaflow().flows

[Flow('RF_Flow_cloud'),
 Flow('Branch_Flow_Cloud'),
 Flow('Branch_Cloud_Step'),
 Flow('Branch_Cloud_Flow'),
 Flow('DivideByZeroFlow'),
 Flow('RetryFlow'),
 Flow('TimeoutFlow'),
 Flow('CatchDivideByZeroFlow'),
 Flow('TaxiFarePrediction')]

In [3]:
# Expect the same result in the global (None) namespace as in the user:sandbox namespace,
# because your sandbox user is the only user on this Metaflow deployment.
namespace(None)
Metaflow().flows

[Flow('RF_Flow_cloud'),
 Flow('Branch_Flow_Cloud'),
 Flow('Branch_Cloud_Step'),
 Flow('Branch_Cloud_Flow'),
 Flow('DivideByZeroFlow'),
 Flow('RetryFlow'),
 Flow('TimeoutFlow'),
 Flow('CatchDivideByZeroFlow'),
 Flow('TaxiFarePrediction')]
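
The two listings match because only one user has produced results. One way to picture the filtering is the toy model below. It is a sketch, not Metaflow's implementation: it assumes each run carries a namespace tag such as `user:sandbox`, and the run records, the second user `alice`, and `ChurnFlow` are hypothetical.

```python
# Toy model only: these dictionaries stand in for run metadata; they are not Metaflow objects.
RUNS = [
    {"flow": "TaxiFarePrediction", "tags": {"user:sandbox"}},
    {"flow": "RetryFlow", "tags": {"user:sandbox"}},
    {"flow": "TaxiFarePrediction", "tags": {"user:alice"}},  # hypothetical second user
    {"flow": "ChurnFlow", "tags": {"user:alice"}},           # hypothetical flow
]


def visible_flows(runs, active_namespace):
    """Return flow names with at least one run visible in the active namespace.

    A namespace of None models the global namespace, where nothing is filtered out.
    """
    names = []
    for run in runs:
        if active_namespace is None or active_namespace in run["tags"]:
            if run["flow"] not in names:
                names.append(run["flow"])
    return names


print(visible_flows(RUNS, "user:sandbox"))  # only the sandbox user's flows
print(visible_flows(RUNS, None))            # every flow, regardless of who ran it
```

With a second user on the deployment, the global listing would grow while the `user:sandbox` listing stays the same.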

## The production namespace
There is a special kind of namespace designated for production. Production namespaces are important for a variety of reasons. For example, they let you add authorization keys to the process of deploying to a valuable flow's production namespace.

To get started in production, find your `TaxiFarePrediction` flow from the week 3 project and run the command below, changing the file path to wherever you placed the flow in your week 4 workspace.

In [4]:
# Watch the file path if you run in the terminal! It may be easiest to do `cd ./notebooks`, if you want to run the commands as is from the terminal.
# If you run this cell from the notebook, you should be able to use the path as is.

# This will take a minute or two the first time it has to build the conda environment.
# Find the namespace this flow is deployed in. Is it production? 🧐

! python ../flows/cloud/event_triggered_linear_regression_solo.py --environment=conda argo-workflows create

[35m[1mMetaflow 2.10.6+ob(v1)[0m[35m[22m executing [0m[31m[1mTaxiFarePrediction[0m[35m[22m[0m[35m[22m for [0m[31m[1muser:sandbox[0m[35m[22m[K[0m[35m[22m[0m
[35m2023-11-12 08:11:44.794 [0m[22mCreating local datastore in current directory (/home/workspace/workspaces/full-stack-ml-metaflow-corise-week-4/notebooks/.metaflow)[0m
[35m[22mValidating your flow...[K[0m[35m[22m[0m
[32m[1m    The graph looks good![K[0m[32m[1m[0m
[35m[22mRunning pylint...[K[0m[35m[22m[0m
[32m[1m    Pylint is happy![K[0m[32m[1m[0m
[1mDeploying [0m[31m[1mtaxifareprediction[0m[1m to Argo Workflows...[K[0m[1m[0m
[22m[K[0m[22m[0m
[22mThe namespace of this production flow is[K[0m[22m[0m
[32m[22m    production:taxifareprediction-0-ouyj[K[0m[32m[22m[0m
[22mTo analyze results of this production flow add this line in your notebooks:[K[0m[22m[0m
[32m[22m    namespace("production:taxifareprediction-0-ouyj")[K[0m[32m[22m[0m
[22mIf 

In the output, you will see a line like `namespace("user:sandbox")` or `namespace("production:taxifareprediction-0-ouyj")`.
Copy that line, paste it as the first line of the next cell, and run it.

In [5]:
namespace("production:taxifareprediction-0-ouyj")

'production:taxifareprediction-0-ouyj'
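
The prefix before the colon tells you which kind of namespace the deployment reported. The helpers below are a minimal sketch for checking it. `split_namespace` and `is_production` are hypothetical names, not part of Metaflow, and the sketch assumes namespaces follow the `kind:identifier` shape seen in the output above.

```python
def split_namespace(ns):
    """Split a 'kind:identifier' namespace string into its two parts.

    None models the global namespace, which has no kind or identifier of its own.
    """
    if ns is None:
        return ("global", None)
    kind, _, identifier = ns.partition(":")
    return (kind, identifier)


def is_production(ns):
    """Return True when the namespace string starts with the 'production' kind."""
    return split_namespace(ns)[0] == "production"


print(split_namespace("production:taxifareprediction-0-ouyj"))
print(is_production("user:sandbox"))
```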

In [6]:
# Run this cell to print your Argo URL.
import json

with open("/home/workspace/.metaflowconfig/config.json") as f:
    res = json.load(f)
# Replace 'vs' with 'argo' in the VS Code URL to get the Argo URL.
print("Go to this URL:", res['SANDBOX_VSCODE_URL'].replace('vs', 'argo'), "and click on the second top left tab 'Workflow Templates'. Then come back here and trigger the flow.")

Go to this URL: https://argo-pw-323314244.outerbounds.dev and click on the second top left tab 'Workflow Templates'. Then come back here and trigger the flow.


In [7]:
! python ../flows/cloud/event_triggered_linear_regression_solo.py --environment=conda argo-workflows trigger

[35m[1mMetaflow 2.10.6+ob(v1)[0m[35m[22m executing [0m[31m[1mTaxiFarePrediction[0m[35m[22m[0m[35m[22m for [0m[31m[1muser:sandbox[0m[35m[22m[K[0m[35m[22m[0m
[35m[22mValidating your flow...[K[0m[35m[22m[0m
[32m[1m    The graph looks good![K[0m[32m[1m[0m
[35m[22mRunning pylint...[K[0m[35m[22m[0m
[32m[1m    Pylint is happy![K[0m[32m[1m[0m
[1mWorkflow [0m[31m[1mtaxifareprediction[0m[1m triggered on Argo Workflows (run-id [0m[31m[1margo-taxifareprediction-bkgkx[0m[1m).[K[0m[1m[0m
[1mSee the run in the UI at https://ui-pw-323314244.outerbounds.dev/TaxiFarePrediction/argo-taxifareprediction-bkgkx[K[0m[1m[0m


In [8]:
# This cell will return an empty list for a minute or two after the Argo UI shows the flow running.
Metaflow().flows
# Do you see a different set of flows than what you saw above with the global or user namespace?

[Flow('TaxiFarePrediction')]

## Projects
The [`@project` decorator](https://docs.metaflow.org/production/coordinating-larger-metaflow-projects) is for production use cases.

It provides three classes of namespaces that affect the behavior of a production deployment:
1. `user` is the default. It will deploy to a user-specific, private namespace. Use it for testing production deployments.
2. `test` denotes custom branches that can be shared amongst multiple users. Use it for deploying experimental versions that can run in parallel with production. Deploy custom branches with `--branch foo`.
3. `prod` denotes the global production namespace. Use it for deploying the official production version of the project. Deploy to production with `--production`. For multiple production variants, deploy custom branches with `--production --branch foo`.

You don't need to memorize these, but they are worth revisiting once you have thought more about the requirements of your ML project.
For example, later this week you will consider how to deploy multiple production variants to score a champion/challenger model on production traffic.
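
The mapping from command-line flags to branches can be sketched in a few lines. This is an illustration of the naming convention, not Metaflow's own code. `project_branch` and `deployment_name` are hypothetical helpers, and the `project.branch.flowname` form with a lowercased flow name is an assumption based on what the deploy logs in this notebook print.

```python
def project_branch(username, branch=None, production=False):
    """Return the branch name implied by the --branch and --production flags."""
    if production:
        # --production alone gives 'prod'; adding --branch foo gives 'prod.foo'.
        return "prod" if branch is None else f"prod.{branch}"
    if branch is not None:
        # --branch foo without --production gives a shared test branch.
        return f"test.{branch}"
    # No flags: a private, user-specific branch.
    return f"user.{username}"


def deployment_name(project, branch_name, flow_name):
    """Join project, branch, and lowercased flow name into one deployment name."""
    return f"{project}.{branch_name}.{flow_name.lower()}"


for kwargs in ({}, {"branch": "foo"}, {"production": True}, {"production": True, "branch": "foo"}):
    branch_name = project_branch("sandbox", **kwargs)
    print(deployment_name("fullstack", branch_name, "TaxiFarePrediction"))
```

Each flag combination yields a distinct name, which is what lets several variants of the same flow be deployed side by side.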

### Motivation
Consider the situation after you deploy `TaxiFarePrediction` to Argo and it is running in production.
The next time you deploy `TaxiFarePrediction` by running the same command you ran a few cells ago,
```sh
python ../flows/cloud/event_triggered_linear_regression.py --environment=conda argo-workflows create
```
it will overwrite the production `TaxiFarePrediction` flow. Clearly, we need a way to run experiments without touching the production deployment.

What do you do when the workflow starts performing well and multiple people want to test their own production deployments without interfering with yours? Or when, as a single developer, you want to experiment with multiple independent deployments of your workflow? How do you create a new workflow without overwriting the existing one?
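
The overwrite problem can be pictured with a toy registry keyed by deployment name. This is a sketch under that assumption, not how Argo Workflows stores templates. `deploy` is a hypothetical helper and the version labels are made up.

```python
def deploy(registry, name, definition):
    """Store a definition under a name; return True if an existing entry was replaced."""
    replaced = name in registry
    registry[name] = definition
    return replaced


deployments = {}

# Deploying twice under the same name replaces the first deployment.
print(deploy(deployments, "taxifareprediction", "version 1"))
print(deploy(deployments, "taxifareprediction", "version 2"))

# Distinct names coexist, so neither deployment disturbs the other.
print(deploy(deployments, "fullstack.user.sandbox.taxifareprediction", "experiment"))
print(deploy(deployments, "fullstack.prod.taxifareprediction", "official"))
print(len(deployments))
```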

### Going to --production

Metaflow's `@project` decorator makes it easy to specify the production namespace. You can use it to deploy workflows in different namespaces and to designate a dedicated production branch that is kept free of experimental work.

In [9]:
! python ../flows/cloud/event_triggered_linear_regression.py --environment=conda --production argo-workflows create

[35m[1mMetaflow 2.10.6+ob(v1)[0m[35m[22m executing [0m[31m[1mTaxiFarePrediction[0m[35m[22m[0m[35m[22m for [0m[31m[1muser:sandbox[0m[35m[22m[K[0m[35m[22m[0m
[35m[22mProject: [0m[32m[1mfullstack[0m[35m[22m, Branch: [0m[32m[1mprod[0m[35m[22m[K[0m[35m[22m[0m
[35m[22mValidating your flow...[K[0m[35m[22m[0m
[32m[1m    The graph looks good![K[0m[32m[1m[0m
[35m[22mRunning pylint...[K[0m[35m[22m[0m
[32m[1m    Pylint is happy![K[0m[32m[1m[0m
[1mDeploying [0m[31m[1mfullstack.prod.taxifareprediction[0m[1m to Argo Workflows...[K[0m[1m[0m
[22mIt seems this is the first time you are deploying [0m[31m[1mfullstack.prod.taxifareprediction[0m[22m to Argo Workflows.[K[0m[22m[0m
[22m[K[0m[22m[0m
[22mA new production token generated.[K[0m[22m[0m
[22m[K[0m[22m[0m
[22mThe namespace of this production flow is[K[0m[22m[0m
[32m[22m    production:mfprj-a7y3zg2tm4k4wauw-0-simf[K[0m[32m[22m[0m
[22

In [11]:
namespace("production:mfprj-a7y3zg2tm4k4wauw-0-simf")

'production:mfprj-a7y3zg2tm4k4wauw-0-simf'

In [13]:
! python ../flows/cloud/event_triggered_linear_regression.py --environment=conda --production argo-workflows trigger

[35m[1mMetaflow 2.10.6+ob(v1)[0m[35m[22m executing [0m[31m[1mTaxiFarePrediction[0m[35m[22m[0m[35m[22m for [0m[31m[1muser:sandbox[0m[35m[22m[K[0m[35m[22m[0m
[35m[22mProject: [0m[32m[1mfullstack[0m[35m[22m, Branch: [0m[32m[1mprod[0m[35m[22m[K[0m[35m[22m[0m
[35m[22mValidating your flow...[K[0m[35m[22m[0m
[32m[1m    The graph looks good![K[0m[32m[1m[0m
[35m[22mRunning pylint...[K[0m[35m[22m[0m
[32m[1m    Pylint is happy![K[0m[32m[1m[0m
[1m    Argo Workflows error[0m[22m:[K[0m[22m[0m
[22m    The workflow [0m[31m[1mfullstack.prod.taxifareprediction[0m[22m doesn't exist on Argo Workflows in namespace [0m[31m[1mjobs-pw-323314244[0m[22m. Please deploy your flow first.[K[0m[22m[0m
[22m[K[0m
