# ⏱️ **job_scheduler_pattern**

Sample implementation pattern for multiple scheduled executions in Microsoft Fabric.

More information is available on Medium:
- [English version](https://medium.com/@baggirraf/%EF%B8%8F-one-to-schedule-them-all-scheduling-and-executing-processes-in-microsoft-fabric-ef50da361bc0)
- [Spanish version](https://medium.com/@baggirraf/%EF%B8%8F-uno-para-programarlos-a-todos-agendado-y-ejecuci%C3%B3n-de-procesos-en-microsoft-fabric-775cfc36f720)

#### **🧷 Imports and references 🧷**

The cell below references the **_sempy_functions_** notebook, available on [GitHub](https://github.com/l2aFa/rafabric/blob/main/pyspark/sempy_functions.ipynb).

⚠️ Referencing another notebook this way is only allowed in Spark notebooks.

In a Python notebook, you will need to include the reference in the appropriate form or copy the functions themselves into the notebook.

The corresponding library imports and references will also be needed.

---

In [None]:
%run sempy_functions

#### **📚 Use cases and examples 📚**

Below are different examples of invoking different artefacts (pipeline, notebook, dataflow).

Choose the one that best suits your scenario to implement it and then schedule the execution of the **_"scheduler"_** notebook.

Please note:
- The use of parameters is entirely optional. If your artifact does not have parameters, you can omit the payload variable declaration and its inclusion in the fabric_rest_api_caller function call.
- In the specific case of Gen2 dataflows, you must enable the use of parameters, a feature that is currently in preview, as indicated [here](https://learn.microsoft.com/en-us/fabric/data-factory/dataflow-parameters). Be aware of the limitations that this entails.
- A list of allowed values for the artifact type can be found [here](https://learn.microsoft.com/en-us/rest/api/fabric/core/items/list-items?tabs=HTTP#itemtype).
- Note, however, that these values do not correspond to those allowed for the **_job_type_** value required in the API call. I have not found an equivalent list of valid job types, but for the examples, this small mapping will suffice:

```
artifact_type_to_job_type = {
    "DataPipeline": "Pipeline",  # Fabric Data Factory pipeline
    "Dataflow": "Refresh",       # Dataflow Gen2
    "Notebook": "RunNotebook"    # Spark/Python notebook
}
```
---
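
The notes above can be condensed into a small helper. This is only a sketch: `build_job_call` and `ARTIFACT_TYPE_TO_JOB_TYPE` are hypothetical names that are not part of **_sempy_functions_**, and the keyword names (`source_uri`, `method`, `audience`, `source_payload`) are simply the ones used in the samples in this notebook. It resolves the job type from the mapping and only includes the payload when one is supplied.

```python
from typing import Any, Dict, Optional

# Mapping taken from the note above; extend it if you need other item types.
ARTIFACT_TYPE_TO_JOB_TYPE = {
    "DataPipeline": "Pipeline",   # Fabric Data Factory pipeline
    "Dataflow": "Refresh",        # Dataflow Gen2
    "Notebook": "RunNotebook",    # Spark/Python notebook
}


def build_job_call(
    workspace_id: str,
    artifact_id: str,
    artifact_type: str,
    payload: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return keyword arguments for a job request (hypothetical helper)."""
    try:
        job_type = ARTIFACT_TYPE_TO_JOB_TYPE[artifact_type]
    except KeyError:
        raise ValueError(
            f"No job type mapped for artifact type '{artifact_type}'."
        ) from None

    call_kwargs: Dict[str, Any] = {
        "source_uri": (
            f"v1/workspaces/{workspace_id}/items/{artifact_id}"
            f"/jobs/instances?jobType={job_type}"
        ),
        "method": "post",
        "audience": "fabric",
    }
    # Parameters are optional: leave the payload out when there is none.
    if payload is not None:
        call_kwargs["source_payload"] = payload
    return call_kwargs
```

In the scheduler notebook, the result could then be unpacked into the call, for example `response = fabric_rest_api_caller(**build_job_call(workspace_id, artifact_id, "Notebook"))`.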


##### 1️⃣ Fabric Data Factory pipeline sample

In [None]:
# We can get the workspace identifier through sempy or set its value explicitly
# The lookup function falls back to its default workspace when none is passed, but the REST URI below always needs the identifier
workspace_id = fabric.get_workspace_id()

# We look for the identifier corresponding to the artefact we want to execute on a scheduled basis
# We use the function get_artifact_guid_by_name to obtain its corresponding GUID, given that its name should not change between environments
# As a demonstration, we make the call passing values for all arguments, including the optional ones.
artifact_name = "YOUR_PIPELINE_NAME_HERE"
artifact_type = "DataPipeline"
artifact_id = get_artifact_guid_by_name(artifact_name=artifact_name, workspace=workspace_id, artifact_type=artifact_type)
if not artifact_id:
    raise Exception(f"*ERROR*: ❌ Artifact '{artifact_name}' not found in workspace '{workspace_id}'.")

# Parameters of the invoked artefact
# Can be skipped if artifact has no parameters
scheduler_sample_param_01 = "I was passed first"
scheduler_sample_param_02 = "I was second"

# Prepare arguments for the call
job_type = "Pipeline"
rest_api_uri = f"v1/workspaces/{workspace_id}/items/{artifact_id}/jobs/instances?jobType={job_type}"
# The payload can also be skipped if the artifact has no parameters
# The keys inside the "parameters" section must match the parameter names defined in the pipeline
payload = {
    "executionData": {
        "parameters": {
            "pipeline_sample_param_01": scheduler_sample_param_01,
            "pipeline_sample_param_02": scheduler_sample_param_02
        }
    }
}

# Request artifact execution
# As a demonstration, we make the call passing values for all arguments, including the optional ones.
response = fabric_rest_api_caller(source_uri=rest_api_uri, method="post", audience="fabric", source_payload=payload)
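
Each sample repeats the same "look up the GUID, fail if it is missing" step. A minimal sketch of how that step could be factored out is shown below. `require_artifact_id` is a hypothetical helper, and the lookup function is passed in as an argument so the snippet runs on its own; in the notebook you would pass `get_artifact_guid_by_name`, assuming it accepts the `artifact_name`, `workspace`, and `artifact_type` keywords used above.

```python
from typing import Callable, Optional


def require_artifact_id(
    lookup: Callable[..., Optional[str]],
    artifact_name: str,
    workspace: Optional[str] = None,
    artifact_type: Optional[str] = None,
) -> str:
    """Resolve an artifact GUID or raise a descriptive error (hypothetical helper)."""
    # Only forward the optional arguments that were actually provided,
    # so the lookup function can apply its own defaults for the rest.
    optional = {"workspace": workspace, "artifact_type": artifact_type}
    extra = {key: value for key, value in optional.items() if value is not None}

    artifact_id = lookup(artifact_name=artifact_name, **extra)
    if not artifact_id:
        where = f" in workspace '{workspace}'" if workspace else ""
        raise LookupError(f"Artifact '{artifact_name}' not found{where}.")
    return artifact_id
```

For example, `artifact_id = require_artifact_id(get_artifact_guid_by_name, "YOUR_PIPELINE_NAME_HERE", artifact_type="DataPipeline")` would replace the lookup and the `if not artifact_id` check.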

##### 2️⃣ Spark/Python notebook sample

In [None]:
# This time we specify the workspace guid/name where the artifact lives
# Note that the REST URI built below needs the workspace GUID, so use the GUID if you reuse this variable there
# We do not specify artifact_type
workspace_id = "YOUR_WORKSPACE_GUID/NAME_HERE"
artifact_name = "YOUR_NOTEBOOK_NAME_HERE"
artifact_id = get_artifact_guid_by_name(artifact_name=artifact_name, workspace=workspace_id)
if not artifact_id:
    raise Exception(f"*ERROR*: ❌ Artifact '{artifact_name}' not found in workspace '{workspace_id}'.")

scheduler_sample_param_01 = "I was passed first"
scheduler_sample_param_02 = "I was second"

job_type = "RunNotebook"
rest_api_uri = f"v1/workspaces/{workspace_id}/items/{artifact_id}/jobs/instances?jobType={job_type}"
# The keys inside the "parameters" section must match the parameter names defined in the notebook parameter cell
payload = {
  "executionData": {
    "parameters": {
      "notebook_sample_param_01": {
        "value": scheduler_sample_param_01,
        "type": "string"
      },
      "notebook_sample_param_02": {
        "value": scheduler_sample_param_02,
        "type": "string"
      }
    }
  }
}

response = fabric_rest_api_caller(source_uri=rest_api_uri, method="post", audience="fabric", source_payload=payload)

##### 3️⃣ Dataflow Gen2 sample

In [None]:
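# No workspace specified for the artifact lookup, so the function uses its default
# The REST URI below still needs a workspace identifier, so we take the current workspace through sempy
# artifact_type specified
workspace_id = fabric.get_workspace_id()
artifact_name = "YOUR_DATAFLOW_NAME_HERE"
artifact_type = "Dataflow"
artifact_id = get_artifact_guid_by_name(artifact_name=artifact_name, artifact_type=artifact_type)
if not artifact_id:
    raise Exception(f"*ERROR*: ❌ Artifact '{artifact_name}' not found.")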

scheduler_sample_param_01 = "I was passed first"
scheduler_sample_param_02 = "I was second"

job_type = "Refresh"
rest_api_uri = f"v1/workspaces/{workspace_id}/items/{artifact_id}/jobs/instances?jobType={job_type}"
# Each parameterName value must match the name of a parameter defined in the dataflow, and its type must match that parameter's type
# The type attribute takes the same values as the ones available for Power Query parameters
# Remember to enable the parameters option in the dataflow options
payload = {
  "executionData": {
    "parameters": [
      {
        "parameterName": "dataflow_sample_param_01",
        "type": "Text",
        "value": scheduler_sample_param_01
      },
      {
        "parameterName": "dataflow_sample_param_02",
        "type": "Text",
        "value": scheduler_sample_param_02
      }
    ]
  }
}

response = fabric_rest_api_caller(source_uri=rest_api_uri, method="post", audience="fabric", source_payload=payload)
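
The three samples wrap the same parameters in three different payload shapes. The sketch below puts them side by side. `build_payload` is a hypothetical helper, not part of **_sempy_functions_**, and it deliberately handles string parameters only, using the `"string"` and `"Text"` type labels that appear in the samples above; other parameter types would need their own type labels.

```python
from typing import Any, Dict, Mapping


def build_payload(artifact_type: str, params: Mapping[str, str]) -> Dict[str, Any]:
    """Wrap string parameters in the shape each sample above uses (hypothetical helper)."""
    if artifact_type == "DataPipeline":
        # Pipelines take a flat name -> value dictionary.
        parameters: Any = dict(params)
    elif artifact_type == "Notebook":
        # Notebooks take name -> {"value", "type"} entries.
        parameters = {
            name: {"value": value, "type": "string"}
            for name, value in params.items()
        }
    elif artifact_type == "Dataflow":
        # Dataflows take a list of {"parameterName", "type", "value"} entries.
        parameters = [
            {"parameterName": name, "type": "Text", "value": value}
            for name, value in params.items()
        ]
    else:
        raise ValueError(f"Unsupported artifact type '{artifact_type}'.")
    return {"executionData": {"parameters": parameters}}
```

For instance, `build_payload("Dataflow", {"dataflow_sample_param_01": "I was passed first"})` yields the same structure as the dataflow payload written out by hand above, with a single parameter.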