# Getting Started with Taipy Core on Notebooks

!!! important "Supported Python versions"

    Taipy requires **Python 3.8** or newer.

Welcome to the **Getting Started Core** guide for Taipy Core. This tour shows you how to use Taipy Core to orchestrate pipelines. Taipy Core implements a modern backend for any data-driven application based on your business case.


# Taipy Core

Taipy Core is the component of Taipy that facilitates pipeline orchestration. There are many reasons to use Taipy Core:

- Taipy Core efficiently manages the execution of your functions/pipelines.

- Taipy Core manages data sources and monitors KPIs.

- Taipy Core provides easy management of multiple pipelines and end-user scenarios, which comes in handy in the context of Machine Learning or Mathematical optimization.

Each step of the **"Getting Started Core"** will focus on basic concepts of *Taipy Core*. Note that every step is dependent on 
the code of the previous one. After completing the last step, you will have the skills to develop your own Taipy 
application. 

## Before we begin

Only Taipy needs to be installed. The **Taipy** package requires Python 3.8 or newer.



In [0]:
# !pip install taipy


## Using Notebooks
When using notebooks, you **may want to restart the kernel** after each run of Taipy Core.


> You can download the code of this step [here](https://docs.taipy.io/en/latest/getting_started/src/step_01.py) or all the steps [here](https://github.com/Avaiga/taipy-getting-started-core/tree/develop/src).

# Step 1: Configuration and execution

Before looking at some code examples, let's define some basic terms Taipy Core uses. Taipy Core revolves around four major concepts.

## Four fundamental concepts in Taipy Core:
- [**Data Nodes**](https://docs.taipy.io/en/latest/manuals/core/concepts/data-node/): are the translation of _variables_ in Taipy. Data Nodes don't contain the data but know how to retrieve it. They can refer to any data: any Python object (string, int, list, dict, model, data frame, etc.), a Pickle file, a CSV file, a SQL database, etc. They know how to read and write data. You can even write your own custom Data Node to access a particular data format.

- [**Tasks**](https://docs.taipy.io/en/latest/manuals/core/concepts/task/): are the translation of _functions_ in Taipy.

- [**Pipelines**](https://docs.taipy.io/en/latest/manuals/core/concepts/pipeline/): are a list of tasks executed with intelligent scheduling created automatically by Taipy. They usually represent a sequence of Tasks/functions ranging from data processing steps to simple baseline Algorithms all the way to more sophisticated pipelines: Machine-Learning, Mathematical models, Simulation, etc.

- [**Scenarios**](https://docs.taipy.io/en/latest/manuals/core/concepts/scenario/): End-Users often require modifying various parameters to reflect different business situations. Taipy Scenarios provide the framework to "run"/"execute" pipelines under different conditions/variations (i.e., data/parameters modified by the end-user)


## What is a configuration?

A [**configuration**](https://docs.taipy.io/en/latest/manuals/core/config/) is a structure to define scenarios and pipelines. It represents our Directed Acyclic Graph(s); it models the data sources, parameters, and tasks. Once defined, a configuration acts like a superclass; it is used to generate different instances of scenarios.


Let's create our first configuration. For this, we have two alternatives:

- Using Taipy Studio

- Or directly coding in Python.

Once the scenario configuration is defined, we can create instances, aka *'entities'*, of scenarios that can be submitted for execution. Entities are made from all configuration objects: scenario configs, pipeline configs, task configs, and data node configs. We will refer to them as _scenario entities_, _pipeline entities_, _task entities_, and _Data Node entities_. Since this closely mirrors the class/instance mechanism of object-oriented programming, we will use the words entity and instance interchangeably.
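The configuration/entity relationship can be pictured with a few lines of plain Python. This is only an analogy, not how Taipy is implemented: `ScenarioConfigSketch`, `ScenarioEntitySketch`, and `create_entity` are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class ScenarioConfigSketch:
    """Hypothetical stand-in for a scenario configuration (the blueprint)."""
    id: str
    default_input: int


@dataclass
class ScenarioEntitySketch:
    """Hypothetical stand-in for a scenario entity (one instance of the blueprint)."""
    entity_id: str
    config_id: str
    input_value: int


_entity_numbers = count(1)


def create_entity(cfg):
    # Each call builds a new, independent entity from the same configuration.
    return ScenarioEntitySketch(f"{cfg.id}_{next(_entity_numbers)}", cfg.id, cfg.default_input)


cfg = ScenarioConfigSketch(id="my_scenario", default_input=21)
first = create_entity(cfg)
second = create_entity(cfg)
second.input_value = 54  # changing one entity leaves the other untouched
print(first.input_value, second.input_value)
```

One configuration yields as many entities as needed, and each entity carries its own data.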

Let's consider the simplest possible pipeline: a single function taking as input an integer and generating an integer output (doubling the input number). See below:




In [1]:
from taipy import Config
import taipy as tp

# Normal function used by Taipy
def double(nb):
    return nb * 2



<div align="center">
 <img src="https://raw.githubusercontent.com/Avaiga/taipy-getting-started-core/develop/step_01/config_01.svg" width=700>
</div>

- Two Data Nodes are configured ('input' and 'output'). The 'input' Data Node has a _default_data_ set to 21. They will be stored as Pickle files (the default storage format) and are unique to each scenario entity/instance (this is the concept of scope, which is covered later). To clarify, the names given to the Data Nodes are arbitrary. The task links the two Data Nodes through the Python function `double`.

- The pipeline contains this task, and the scenario includes this single pipeline.





**Alternative 2:** Configuration using Python Code

Here is the code to configure a simple scenario.



In [2]:
# Configuration of Data Nodes
input_data_node_cfg = Config.configure_data_node("input", default_data=21)
output_data_node_cfg = Config.configure_data_node("output")

# Configuration of tasks
task_cfg = Config.configure_task("double",
                                 double,
                                 input_data_node_cfg,
                                 output_data_node_cfg)

# Configuration of the pipeline and scenario
pipeline_cfg = Config.configure_pipeline("my_pipeline", [task_cfg])
scenario_cfg = Config.configure_scenario("my_scenario", [pipeline_cfg])



The code below presents how you can create scenarios and submit them.

First of all, Taipy Core has to be launched (`tp.Core().run()`). It will create a service that acts as a job scheduler.

Creating a scenario/pipeline (`tp.create_scenario(<Scenario Config>)` / `tp.create_pipeline(<Pipeline Config>)`) will create all its related entities (_tasks_, _Data Nodes_, etc). These entities are being created thanks to the previous configuration. Still, no scenario has been run yet. `tp.submit(<Scenario>)` is the line of code that will run all the scenario-related pipelines and tasks. Note that a pipeline or a task can also be submitted directly (`tp.submit(<Pipeline>)`, `tp.submit(<Task>)`).



In [3]:
# Run of the Core
tp.Core().run()

# Creation of the scenario and execution
scenario = tp.create_scenario(scenario_cfg)
tp.submit(scenario)

print("Value at the end of task", scenario.output.read())



Results:

```
[2022-12-22 16:20:02,740][Taipy][INFO] job JOB_double_699613f8-7ff4-471b-b36c-d59fb6688905 is completed.
Value at the end of task 42
```

'.data' is the default storage folder for Taipy Core. It contains data, scenarios, pipelines, jobs, and tasks. These entities can be persisted between two runs depending on how the code is run.

## Ways of executing the code: Versioning

Taipy Core provides a [versioning system](https://docs.taipy.io/en/latest/manuals/core/versioning/) to keep track of the changes that a configuration will experience over time: new data sources, new parameters, new versions of your Machine Learning engine, etc. `python main.py -h` opens a helper to understand the versioning options. Here are the principal ways to run the code with versioning:

- _Development_: is the default way of executing the code. When running a Taipy Core application in development mode, Taipy can't access all entities created from a previous Development run. Launching your Taipy code as `python main.py` executes it in Development mode.

- _Experiment_: all Taipy Core entities from the previous run are kept, but each new run will ignore them. An identifier is attached to each run. 
`python main.py --experiment` will execute the code in Experiment mode. The user can then re-execute a previous run by passing a previously used identifier, for example `python main.py --experiment 1`.

- _Production_: When running a Taipy Core application in production mode, Taipy can access all entities attached to the current or another production version. It corresponds to the case where the application is stable and running in a production environment. The user can decide the identifier to use. `python main.py --production` will execute the code in Production mode; use `python main.py --production 1` to run it with a specific version.


> You can download the code of this step [here](https://docs.taipy.io/en/latest/getting_started/src/step_02.py) or all the steps [here](https://github.com/Avaiga/taipy-getting-started-core/tree/develop/src).

# Step 2: Basic functions

Let's discuss some of the essential functions that come along with Taipy.

- [`<Data Node>.write(<new value>)`](https://docs.taipy.io/en/latest/manuals/core/entities/data-node-mgt/#read-write-a-data-node): this instruction changes the data of a Data Node. It also changes the _last_edit_date_ of the Data Node, influencing whether a task can be skipped.

- [`tp.get_scenarios()`](https://docs.taipy.io/en/latest/manuals/core/entities/scenario-cycle-mgt/#get-all-scenarios): this function returns the list of all the scenarios

- [`tp.get(<Taipy object ID>)`](https://docs.taipy.io/en/latest/manuals/core/entities/data-node-mgt/#get-data-node): this function returns an entity based on the id of the entity

- [`tp.delete(<Taipy object ID>)`](https://docs.taipy.io/en/latest/manuals/core/entities/scenario-cycle-mgt/#delete-a-scenario): this function deletes the entity and nested elements based on the id of the entity

## Utility of having scenarios

Taipy lets the user create multiple instances of the same configuration. Data can differ between scenario instances, and it is essential to detect and understand these differences, for example by comparing the outputs of different instances. Such differences between scenario entities (from the same scenario configuration) can be due to the following:

- Changing data from input data nodes, 

- Randomness in a task (random algorithm), 

- Different values from parameters set by the end-user, etc.

The developer can directly change the Data Node entities with the _write_ function (see below).

<div align="center">
 <img src="https://raw.githubusercontent.com/Avaiga/taipy-getting-started-core/develop/step_02/config_02.svg" width=700>
</div>



In [4]:
scenario = tp.create_scenario(scenario_cfg, name="Scenario")
tp.submit(scenario)
print("Output of First submit:", scenario.output.read())



Results:

```
[2022-12-22 16:20:02,874][Taipy][INFO] job JOB_double_a5ecfa4d-1963-4776-8f68-0859d22970b9 is completed.
Output of First submit: 42
```

## _write_ function

Data of a Data Node can be changed using _write_. The syntax is `<Scenario>.<Pipeline>.<Data Node>.write(value)`. If the scenario contains a single pipeline, we can write `<Scenario>.<Data Node>.write(value)`.




In [5]:
print("Before write", scenario.input.read())
scenario.input.write(54)
print("After write",scenario.input.read())



Results:
```
Before write 21
After write 54
```

The submission of the scenario will update the output values.




In [6]:
tp.submit(scenario)
print("Second submit",scenario.output.read())


Results:
```
[2022-12-22 16:20:03,011][Taipy][INFO] job JOB_double_7eee213f-062c-4d67-b0f8-4b54c04e45e7 is completed.
Second submit 108
```

## Other useful functions

- `tp.get_scenarios` accesses all the scenarios by returning a list.



In [7]:
print([s.name for s in tp.get_scenarios()])



Results:
```
["Scenario"]
```

- Get an entity from its id:



In [8]:
scenario = tp.get(scenario.id)



- Delete an entity through its id. For example, to delete a scenario:



In [9]:
tp.delete(scenario.id)


> You can download the code of this step [here](https://docs.taipy.io/en/latest/getting_started/src/step_04.py) or all the steps [here](https://github.com/Avaiga/taipy-getting-started-core/tree/develop/src).

# Step 3: Data Node types

- *[Pickle](https://docs.taipy.io/en/latest/manuals/core/config/data-node-config/#pickle)* (default): Taipy can read and write any data that can be serializable.

- *[CSV](https://docs.taipy.io/en/latest/manuals/core/config/data-node-config/#csv)*: Taipy can read and write any data frame as a CSV.

- *[JSON](https://docs.taipy.io/en/latest/manuals/core/config/data-node-config/#json)*: Taipy can read and write any JSONable data as a JSON file.

- *[SQL](https://docs.taipy.io/en/latest/manuals/core/config/data-node-config/#sql)*: Taipy can read and write from/to a SQL table or a SQL database.

- *[Mongo](https://docs.taipy.io/en/latest/manuals/core/config/data-node-config/#mongo-collection)*: Taipy can read and write from/to a Mongo Collection

- *[Parquet](https://docs.taipy.io/en/latest/manuals/core/config/data-node-config/#parquet)*: Taipy can read and write data frames from/to a Parquet format

- *[Generic](https://docs.taipy.io/en/latest/manuals/core/config/data-node-config/#generic)*: Taipy provides a generic Data Node that can read and store any data based on a custom _reading_ and _writing_ function created by the user.

This section will use the simple DAG/execution configuration described below. The configuration consists of the following:

1. Three Data Nodes:

    - _historical_data_: This is a CSV-type Data Node. It reads from a CSV file into the initial data frame. You can find the dataset used in the Getting Started [here](https://github.com/Avaiga/taipy-getting-started-core/blob/develop/src/time_series.csv).

    - _month_data_: This is a pickle Data Node. It stores in a pickle format the data frame generated by the task '_filter_' (obtained after some filtering of the initial data frame).

    - _nb_of_values_: This is also a pickle Data Node. It stores an integer generated by the '_count_values_' task.

2. Two tasks linking these Data Nodes:

    - _filter_: filters on the current month of the data frame

    - _count_values_: calculates the number of elements in this month

3. One single pipeline in this scenario configuration grouping these two tasks.
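The two tasks above form a small graph: each task becomes runnable once the Data Nodes it reads have been written. The sketch below shows, in plain Python, how an execution order can be derived from task inputs and outputs. It is an illustration only and makes no claim about Taipy's scheduler; `execution_order` is a hypothetical helper.

```python
def execution_order(tasks):
    """Return task names in an order where every input is written before it is read.

    `tasks` maps a task name to a pair (input data node names, output data node names).
    """
    produced_by_a_task = {out for _, outputs in tasks.values() for out in outputs}
    written = set()   # data nodes already written by an executed task
    pending = dict(tasks)
    order = []
    while pending:
        # Inputs that no task produces are source data and are always available.
        runnable = [
            name for name, (inputs, _) in pending.items()
            if all(node in written or node not in produced_by_a_task for node in inputs)
        ]
        if not runnable:
            raise ValueError("The tasks contain a cycle, so this is not a DAG")
        for name in sorted(runnable):
            order.append(name)
            written.update(pending.pop(name)[1])
    return order


tasks = {
    "count_values": (["month_data"], ["nb_of_values"]),
    "filter": (["historical_data"], ["month_data"]),
}
print(execution_order(tasks))
```

Even though `count_values` is listed first, it is ordered after `filter` because it reads `month_data`, which `filter` produces.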


<div align="center">
 <img src="https://raw.githubusercontent.com/Avaiga/taipy-getting-started-core/develop/step_03/config_03.svg" width=700>
</div>



In [10]:
import datetime as dt
import pandas as pd

def filter_current(df):
    current_month = dt.datetime.now().month
    df['Date'] = pd.to_datetime(df['Date']) 
    df = df[df['Date'].dt.month == current_month]
    return df

def count_values(df):
    return len(df)


In [11]:
# here is a CSV Data Node
historical_data_cfg = Config.configure_csv_data_node(id="historical_data",
                                                     default_path="time_series.csv")
month_values_cfg =  Config.configure_data_node(id="month_data")
nb_of_values_cfg = Config.configure_data_node(id="nb_of_values")


In [12]:
task_filter_cfg = Config.configure_task(id="filter_current",
                                                 function=filter_current,
                                                 input=historical_data_cfg,
                                                 output=month_values_cfg)

task_count_values_cfg = Config.configure_task(id="count_values",
                                                 function=count_values,
                                                 input=month_values_cfg,
                                                 output=nb_of_values_cfg)


In [13]:
pipeline_cfg = Config.configure_pipeline(id="my_pipeline",
                                         task_configs=[task_filter_cfg,
                                                       task_count_values_cfg])

scenario_cfg = Config.configure_scenario(id="my_scenario",
                                         pipeline_configs=[pipeline_cfg])



In [14]:
tp.Core().run()

scenario_1 = tp.create_scenario(scenario_cfg, creation_date=dt.datetime(2022,10,7), name="Scenario 2022/10/7")
scenario_1.submit()

print("Nb of values of scenario 1:", scenario_1.nb_of_values.read())



Results:

```
[2022-12-22 16:20:03,424][Taipy][INFO] job JOB_filter_current_257edf8d-3ca3-46f5-aec6-c8a413c86c43 is completed.
[2022-12-22 16:20:03,510][Taipy][INFO] job JOB_count_values_90c9b3c7-91e7-49ef-9064-69963d60f52a is completed.

Nb of values of scenario 1: 896
```


# Step 4: Cycles

[Cycles](https://docs.taipy.io/en/latest/manuals/core/concepts/cycle/) have been introduced to reflect business situations our customers frequently encounter. 

For instance, a large Fast Food chain wants to generate sales forecasts for its stores every week. When creating a given scenario, it will need to be attached to a given week. And often, a single one will be published amongst all the scenarios generated for a given week. This kind of 'official' scenario will be referred to as the 'Primary' scenario in Taipy Core.

Note that Cycles can be ignored entirely if the business problem has no time frequency. 


In this step, scenarios are attached to a MONTHLY cycle. Using Cycles, the developer benefits from Taipy-specific functions to navigate through these Cycles. For instance, given a Cycle, Taipy can return all the scenarios created in that month. You can also easily get every primary scenario generated for the past X months to monitor KPIs over time.
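To make the idea of a monthly Cycle concrete, here is a minimal plain-Python sketch that groups scenarios by the month of their creation date. It is an illustration under simplifying assumptions, not Taipy's implementation: `monthly_cycle_key` and the scenario names are hypothetical.

```python
import datetime as dt
from collections import defaultdict


def monthly_cycle_key(creation_date):
    # A monthly period is identified here by its (year, month) pair.
    return (creation_date.year, creation_date.month)


# Hypothetical scenarios and their creation dates
creation_dates = {
    "scenario_a": dt.datetime(2022, 10, 7),
    "scenario_b": dt.datetime(2022, 10, 5),
    "scenario_c": dt.datetime(2021, 9, 1),
}

cycles = defaultdict(list)
for name, creation_date in creation_dates.items():
    cycles[monthly_cycle_key(creation_date)].append(name)

print(dict(cycles))
```

The two October 2022 scenarios land in the same period, while the September 2021 scenario gets a period of its own.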

Let's slightly change the filter function by passing the month as an argument to get started. You must create a new Data Node representing the month (see the steps below).




In [15]:
def filter_by_month(df, month):
    df['Date'] = pd.to_datetime(df['Date']) 
    df = df[df['Date'].dt.month == month]
    return df



Then to introduce Cycles, you need to set the frequency (predefined attribute) of the scenario to Monthly (as described below).

<div align="center">
 <img src="https://raw.githubusercontent.com/Avaiga/taipy-getting-started-core/develop/step_04/config_04.svg" width=700>
</div>







The configuration is the same as in the last step except for the scenario and task configurations. The filter task takes the new _month_ Data Node as a second input, and the scenario receives a new parameter for the frequency.



In [16]:
from taipy.config import Frequency, Scope

month_cfg =  Config.configure_data_node(id="month")

task_filter_cfg = Config.configure_task(id="filter_by_month",
                                             function=filter_by_month,
                                             input=[historical_data_cfg, month_cfg],
                                             output=month_values_cfg)

...

scenario_cfg = Config.configure_scenario(id="my_scenario",
                                         pipeline_configs=[pipeline_cfg],
                                         frequency=Frequency.MONTHLY)







As you can see, a Cycle is activated once you have set the desired frequency on the scenario. In this code snippet, since we have specified `frequency=Frequency.MONTHLY`, the corresponding scenario will be automatically attached to the correct period (month) once it is created. The _creation_date_ here is artificially given to the scenarios.



In [17]:
tp.Core().run()

scenario_1 = tp.create_scenario(scenario_cfg,
                                creation_date=dt.datetime(2022,10,7),
                                name="Scenario 2022/10/7")
scenario_2 = tp.create_scenario(scenario_cfg,
                                creation_date=dt.datetime(2022,10,5),
                                name="Scenario 2022/10/5")



Scenario 1 and Scenario 2 are two scenario entities/instances created from the same scenario configuration. They belong to the same Cycle but don't share the same Data Nodes. By default, each scenario instance has its own data node instances. They are not shared with any other scenario. The Scope concept can modify this behavior, which will be covered in the next step.




In [18]:
scenario_1.month.write(10)
scenario_2.month.write(10)


print("Month Data Node of Scenario 1", scenario_1.month.read())
print("Month Data Node of Scenario 2", scenario_2.month.read())

scenario_1.submit()
scenario_2.submit()




Results:
```
Month Data Node of Scenario 1 10
Month Data Node of Scenario 2 10
[2022-12-22 16:20:04,746][Taipy][INFO] job JOB_filter_by_month_a4d3c4a7-5ec9-4cca-8a1b-578c910e255a is completed.
[2022-12-22 16:20:04,833][Taipy][INFO] job JOB_count_values_a81b2f60-e9f9-4848-aa58-272810a0b755 is completed.
[2022-12-22 16:20:05,026][Taipy][INFO] job JOB_filter_by_month_22a3298b-ac8d-4b55-b51f-5fab0971cc9e is completed.
[2022-12-22 16:20:05,084][Taipy][INFO] job JOB_count_values_a52b910a-4024-443e-8ea2-f3cdda6c1c9d is completed.
```

## Primary scenarios

In each Cycle, there is a primary scenario. A primary scenario is interesting because it represents the important scenario of the Cycle, the reference. By default, the first scenario created for a cycle will be primary.

[`tp.set_primary(<Scenario>)`](https://docs.taipy.io/en/latest/manuals/core/entities/scenario-cycle-mgt/#promote-a-scenario-as-primary) allows changing the primary scenario in a Cycle.

`<Scenario>.is_primary` is a boolean indicating whether the scenario is primary.



In [19]:
print("Scenario 1 before", scenario_1.is_primary)
print("Scenario 2 before", scenario_2.is_primary)

tp.set_primary(scenario_2)

print("Scenario 1 after", scenario_1.is_primary)
print("Scenario 2 after", scenario_2.is_primary)



Results:

```
Scenario 1 before True
Scenario 2 before False
Scenario 1 after False
Scenario 2 after True
```

Scenario 3 is the only scenario in another Cycle due to its creation date and is the default primary scenario.



In [20]:
scenario_3 = tp.create_scenario(scenario_cfg,
                                creation_date=dt.datetime(2021,9,1),
                                name="Scenario 2021/9/1")
scenario_3.month.write(9)
scenario_3.submit()

print("Is scenario 3 primary?", scenario_3.is_primary)



Results:

```
[2022-12-22 16:20:05,317][Taipy][INFO] job JOB_filter_by_month_8643e5cf-e863-434f-a1ba-18222d6faab8 is completed.
[2022-12-22 16:20:05,376][Taipy][INFO] job JOB_count_values_72ab71be-f923-4898-a8a8-95ec351c24d9 is completed.

Is scenario 3 primary? True
```

Also, as you can see, every scenario has been submitted and executed entirely. However, the results for these tasks are all the same. Skipping Tasks (defined in subsequent steps) will help optimize your executions by skipping the execution of redundant tasks.

## Useful functions on cycles

- `tp.get_primary_scenarios()`: returns a list of all primary scenarios

- `tp.get_scenarios(cycle=<Cycle>)`: returns all the scenarios in the Cycle

- `tp.get_cycles()`: returns the list of Cycles

- `tp.get_primary(<Cycle>)`: returns the primary scenario of the Cycle


> You can download the code of this step [here](https://docs.taipy.io/en/latest/getting_started/src/step_05.py) or all the steps [here](https://github.com/Avaiga/taipy-getting-started-core/tree/develop/src).

# Step 5: Scopes

[Scopes](https://docs.taipy.io/en/latest/manuals/core/concepts/scope/) determine how Data Nodes are shared between cycles, scenarios, and pipelines. The developer may decide to:

- Keep Data Nodes local to each pipeline.

- Extend the scope by sharing data nodes between a given scenario's pipelines.

- Extend the scope by sharing data nodes across all scenarios of a given cycle.

- Finally, extend the scope globally (across all scenarios of all cycles). For example, the initial/historical dataset is usually shared by all the scenarios/pipelines/cycles. It has a Global Scope and will be unique in the entire application.

To summarize, the different possible scopes are:

- _Pipeline scope_: two pipelines can reference different Data Nodes even if their names are the same. For example, we can have a _prediction_ Data Node of an ARIMA model (ARIMA pipeline) and a _prediction_ Data Node of a RandomForest model (RandomForest pipeline). A scenario can contain multiple pipelines.

- _Scenario scope (default)_: pipelines share the same Data Node within a scenario. 

- _Cycle scope_: scenarios from the same Cycle share the same Data Node.

- _Global scope_: Data Nodes are shared across all the scenarios/pipelines/cycles.

It is worth noting that the default scope for Data nodes is the Scenario scope.
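One way to reason about scopes is to ask which "owner" a Data Node is attached to: two scenarios see the same Data Node exactly when that owner is the same for both. The sketch below illustrates this in plain Python. It is a mental model only, not Taipy's implementation; `sharing_key` and the identifiers are hypothetical.

```python
def sharing_key(scope, pipeline_id, scenario_id, cycle_id):
    """Return the owner a data node is attached to under each scope (illustrative)."""
    owners = {
        "PIPELINE": ("pipeline", pipeline_id),
        "SCENARIO": ("scenario", scenario_id),
        "CYCLE": ("cycle", cycle_id),
        "GLOBAL": ("global", None),
    }
    return owners[scope]


# Two hypothetical scenarios created in the same monthly cycle
scenario_a = dict(pipeline_id="pipeline_a", scenario_id="scenario_a", cycle_id="2022-10")
scenario_b = dict(pipeline_id="pipeline_b", scenario_id="scenario_b", cycle_id="2022-10")

for scope in ("PIPELINE", "SCENARIO", "CYCLE", "GLOBAL"):
    shared = sharing_key(scope, **scenario_a) == sharing_key(scope, **scenario_b)
    print(scope, "shared" if shared else "separate")
```

With Pipeline or Scenario scope the two scenarios keep separate Data Nodes; with Cycle or Global scope they share one.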

<div align="center">
 <img src="https://raw.githubusercontent.com/Avaiga/taipy-getting-started-core/develop/step_05/config_05.svg" width=700>
</div>





Modifying the scope of a Data Node is as simple as changing its Scope parameter inside the configuration.

The configuration is taken from the previous step, so you can copy the previous code directly.



In [21]:
from taipy.config import Scope, Frequency

historical_data_cfg = Config.configure_csv_data_node(id="historical_data",
                                                 default_path="time_series.csv",
                                                 scope=Scope.GLOBAL)
month_cfg =  Config.configure_data_node(id="month", scope=Scope.CYCLE)

month_values_cfg = Config.configure_data_node(id="month_data",
                                               scope=Scope.CYCLE)




Cycles are created based on the _creation_date_ of scenarios. In the example below, we force the creation_date to a given date (in real life, the actual creation date of the scenario gets used automatically).



In [22]:
tp.Core().run()

scenario_1 = tp.create_scenario(scenario_cfg,
                                creation_date=dt.datetime(2022,10,7),
                                name="Scenario 2022/10/7")
scenario_2 = tp.create_scenario(scenario_cfg,
                               creation_date=dt.datetime(2022,10,5),
                               name="Scenario 2022/10/5")
scenario_3 = tp.create_scenario(scenario_cfg,
                                creation_date=dt.datetime(2021,9,1),
                                name="Scenario 2021/9/1")



Scenario 1 and 2 belong to the same Cycle: since _month_ now has a **Cycle** scope, we can define _month_ just once for both scenarios: 1 and 2.




In [23]:
scenario_1.month.write(10)
scenario_3.month.write(9)
print("Scenario 1: month", scenario_1.month.read())
print("Scenario 2: month", scenario_2.month.read())
print("Scenario 3: month", scenario_3.month.read())


Results:
```
Scenario 1: month 10
Scenario 2: month 10
Scenario 3: month 9
```

Defining the _month_ of scenario 1 will also determine the _month_ of scenario 2 since they share the same Data Node. 

This is not the case for _nb_of_values_, which has a Scenario scope; each scenario has its own _nb_of_values_ Data Node with its own value.


> You can download the code of this step [here](https://docs.taipy.io/en/latest/getting_started/src/step_06.py) or all the steps [here](https://github.com/Avaiga/taipy-getting-started-core/tree/develop/src).

# Step 6: Skipping tasks

Skipping tasks is an essential feature of Taipy. Running a function twice with the same input parameters produces the same output for a given pipeline or scenario, so executing it again is a waste of time and resources.

Taipy Core provides for each task the _skippable_ attribute. If this attribute is set to True, Taipy Core's scheduler will automatically detect if changes have occurred on any of the input Data Nodes of a task. If no changes have occurred, it will automatically skip the execution of that task. By default, skippable is set to False.
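The skip decision can be sketched in plain Python as a comparison of edit times: a task is skipped when its output was written after the last change to every input. This is a simplified illustration, not Taipy's scheduler; `DataNodeSketch`, `run_task`, and the logical clock are assumptions made for the example.

```python
from itertools import count

_clock = count(1)  # logical clock standing in for real timestamps


class DataNodeSketch:
    """Hypothetical data node that remembers when it was last written."""

    def __init__(self, value=None):
        self.value = value
        self.last_edit = next(_clock) if value is not None else None

    def write(self, value):
        self.value = value
        self.last_edit = next(_clock)


def run_task(function, inputs, output, skippable=True):
    """Run `function` unless the output is already newer than every input."""
    up_to_date = output.last_edit is not None and all(
        node.last_edit is not None and node.last_edit < output.last_edit
        for node in inputs
    )
    if skippable and up_to_date:
        return "skipped"
    output.write(function(*(node.value for node in inputs)))
    return "completed"


def double(nb):
    return nb * 2


source = DataNodeSketch(21)
result = DataNodeSketch()
statuses = [run_task(double, [source], result)]     # first run: nothing computed yet
statuses.append(run_task(double, [source], result))  # inputs unchanged since last run
source.write(54)                                      # the input changes
statuses.append(run_task(double, [source], result))  # the task must run again
print(statuses, result.value)
```

Writing to an input Data Node moves its edit time past the output's, which is why the third call executes the function again.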


<div align="center">
 <img src="https://raw.githubusercontent.com/Avaiga/taipy-getting-started-core/develop/step_06/config_06.svg" width=700>
</div>







In [24]:
task_filter_cfg = Config.configure_task(id="filter_by_month",
                                             function=filter_by_month,
                                             input=[historical_data_cfg, month_cfg],
                                             output=month_values_cfg,
                                             skippable=True)

task_count_values_cfg = Config.configure_task(id="count_values",
                                                 function=count_values,
                                                 input=month_values_cfg,
                                                 output=nb_of_values_cfg,
                                                 skippable=True)



The configuration is almost the same. `skippable=True` is added to the tasks we want to be skippable.

Here we create three different scenarios with different creation dates and names. Scenario 1 and scenario 2 belong to the same cycle.




In [25]:
tp.Core().run()

scenario_1 = tp.create_scenario(scenario_cfg,
                                creation_date=dt.datetime(2022,10,7),
                                name="Scenario 2022/10/7")
scenario_2 = tp.create_scenario(scenario_cfg,
                               creation_date=dt.datetime(2022,10,5),
                               name="Scenario 2022/10/5")
scenario_3 = tp.create_scenario(scenario_cfg,
                                creation_date=dt.datetime(2021,9,1),
                                name="Scenario 2021/9/1")


In [26]:
# scenario 1 and 2 belong to the same cycle, so 
# defining the month for scenario 1 defines the month for the scenarios in the cycle
scenario_1.month.write(10)
print("Scenario 1: month", scenario_1.month.read())
print("Scenario 2: month", scenario_2.month.read())



Results:

```
Scenario 1: month 10
Scenario 2: month 10
```

Every task has yet to be submitted, so when submitting scenario 1, all tasks will be executed.



In [27]:
print("Scenario 1: submit")
scenario_1.submit()
print("Value", scenario_1.nb_of_values.read())



Results:

```
Scenario 1: submit
[2022-12-22 16:20:09,079][Taipy][INFO] job JOB_filter_by_month_0d7836eb-70eb-4fe6-b954-0e56967831b6 is completed.
[2022-12-22 16:20:09,177][Taipy][INFO] job JOB_count_values_91214241-ce81-42d8-9025-e83509652133 is completed.
Value 849
```

When submitting scenario 2, the scheduler will skip the first task of this second scenario. Indeed, the two scenarios share the same input Data Nodes for this task, and no changes have occurred on these Data Nodes (since the last task run when we submitted scenario 1).



In [28]:
# the first task has already been executed by scenario 1
print("Scenario 2: first submit")
scenario_2.submit()
print("Value", scenario_2.nb_of_values.read())



Results:
```
Scenario 2: first submit
[2022-12-22 16:20:09,317][Taipy][INFO] job JOB_filter_by_month_c1db1f0c-6e0a-4691-b0a3-331d473c4c42 is skipped.
[2022-12-22 16:20:09,371][Taipy][INFO] job JOB_count_values_271cefd0-8648-47fa-8948-ed49e93e3eee is completed.
Value 849
```

Resubmitting the same scenario without any change will skip every task.



In [29]:
# every task has already been executed so that the scheduler will skip everything
print("Scenario 2: second submit")
scenario_2.submit()
print("Value", scenario_2.nb_of_values.read())



Results:
```
Scenario 2: second submit
[2022-12-22 16:20:09,516][Taipy][INFO] job JOB_filter_by_month_da2762d1-6f24-40c1-9bd1-d6786fee7a8d is skipped.
[2022-12-22 16:20:09,546][Taipy][INFO] job JOB_count_values_9071dff4-37b2-4095-a7ed-34ef81daad27 is skipped.
Value 849
```

Scenario 3 is not in the same cycle as the other two. We set its month to 9, and the scheduler will execute every task.




In [30]:
# scenario 3 has no connection to the other scenarios, so everything will be executed
print("Scenario 3: submit")
scenario_3.month.write(9)
scenario_3.submit()
print("Value", scenario_3.nb_of_values.read())



Results:
```
Scenario 3: submit
[2022-12-22 16:20:10,071][Taipy][INFO] job JOB_filter_by_month_c4d06eba-a149-4b79-9194-78972c7b7a18 is completed.
[2022-12-22 16:20:10,257][Taipy][INFO] job JOB_count_values_817df173-6bae-4742-a2c0-b8b8eba52872 is completed.
Value 1012
```  

Here, we change the input Data Node of the pipeline so Taipy will re-run the correct tasks to ensure that everything is up-to-date.




In [31]:
# changing an input data node causes the dependent tasks to be executed again
print("Scenario 3: change in historical data")
scenario_3.historical_data.write(pd.read_csv('time_series_2.csv'))
scenario_3.submit()
print("Value", scenario_3.nb_of_values.read())



Results:

```
Scenario 3: change in historical data
[2022-12-22 16:20:10,870][Taipy][INFO] job JOB_filter_by_month_92f32135-b410-41f0-b9f3-a852c2eb07cd is completed.
[2022-12-22 16:20:10,932][Taipy][INFO] job JOB_count_values_a6a75e13-4cd4-4f7e-bc4e-d14a86733440 is completed.
Value 1012
```
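
The submissions above all follow one rule: a task is re-run only when one of its inputs has been edited since its output was last written. The sketch below is a toy model of that rule in plain Python, not Taipy's actual implementation. `ToyDataNode`, `ToyTask`, the logical clock, and the sample data are all hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass
from itertools import count
from typing import Any, Callable, List

_clock = count(1)  # logical clock: each write gets a strictly larger tick


@dataclass
class ToyDataNode:
    value: Any = None
    last_edit: int = 0  # 0 means "never written"

    def write(self, value):
        self.value = value
        self.last_edit = next(_clock)


@dataclass
class ToyTask:
    function: Callable
    inputs: List[ToyDataNode]
    output: ToyDataNode

    def submit(self):
        # Skip when the output was written after the latest input edit.
        newest_input = max(dn.last_edit for dn in self.inputs)
        if self.output.last_edit > newest_input:
            return "skipped"
        self.output.write(self.function(*(dn.value for dn in self.inputs)))
        return "completed"


# Hypothetical data: the month of a few records, counted for a given month.
months = ToyDataNode()
months.write([9, 10, 10, 9, 10])
month = ToyDataNode()
month.write(10)
nb_of_values = ToyDataNode()

count_task = ToyTask(lambda data, m: sum(1 for x in data if x == m),
                     [months, month], nb_of_values)

first = count_task.submit()    # nothing computed yet, so the task runs
second = count_task.submit()   # inputs unchanged, so the task is skipped
month.write(9)                 # editing an input invalidates the output
third = count_task.submit()    # the task runs again
print(first, second, third, nb_of_values.value)
# prints: completed skipped completed 2
```

In this model, two scenarios that share the same input and output objects would also share the "skipped" decision, which is the situation of scenarios 1 and 2 above.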


> You can download the code of this step [here](https://docs.taipy.io/en/latest/getting_started/src/step_07.py) or all the steps [here](https://github.com/Avaiga/taipy-getting-started-core/tree/develop/src).

# Step 7: Execution modes

Taipy has [different ways](https://docs.taipy.io/en/latest/manuals/core/config/job-config/) to execute the code. Changing the execution mode can be useful for running multiple tasks in parallel.

- _standalone_ mode: asynchronous. Jobs can run in parallel, depending on the execution graph, if _max_nb_of_workers_ > 1.
- _development_ mode: synchronous.
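
To get a feel for the difference between the two modes, here is a rough analogy that uses only the Python standard library. It is not how Taipy executes jobs: `slow_add`, `run_synchronously`, and `run_with_workers` are hypothetical helpers, and the thread pool merely stands in for Taipy's workers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_add(nb, delay=0.2):
    """Stand-in for a long-running task."""
    time.sleep(delay)
    return nb + 10


def run_synchronously(values):
    # One job after the other, like a synchronous execution mode.
    return [slow_add(v) for v in values]


def run_with_workers(values, max_workers=2):
    # Up to max_workers jobs at once, like an asynchronous mode with workers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(slow_add, values))


start = time.perf_counter()
sync_results = run_synchronously([1, 2])
sync_duration = time.perf_counter() - start

start = time.perf_counter()
pool_results = run_with_workers([1, 2])
pool_duration = time.perf_counter() - start

print(sync_results, f"{sync_duration:.2f}s")
print(pool_results, f"{pool_duration:.2f}s")
```

Both calls return the same results. On an idle machine the pooled run should take roughly half as long, because the two sleeps overlap.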

In this step, we define a new configuration and functions to showcase the two execution modes.



In [32]:
import time

# Plain Python functions used by the Taipy tasks
def double(nb):
    return nb * 2

def add(nb):
    # Simulate a long-running task
    print("Wait 10 seconds in add function")
    time.sleep(10)
    return nb + 10



<div align="center">
 <img src="https://raw.githubusercontent.com/Avaiga/taipy-getting-started-core/develop/step_07/config_07.svg" width=700>
</div>



The following line changes the execution mode (the default is _development_). Switching to _standalone_ makes Taipy Core asynchronous. Here, at most two tasks can run concurrently.



In [33]:
Config.configure_job_executions(mode="standalone", max_nb_of_workers=2)


In [34]:
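if __name__ == "__main__":
    tp.Core().run()
    scenario_1 = tp.create_scenario(scenario_cfg)
    # In standalone mode the submissions are asynchronous, so both are queued right away
    scenario_1.submit()
    scenario_1.submit()

    # Give the asynchronous jobs time to finish
    time.sleep(30)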



Jobs from the two submissions are executed simultaneously. With a larger `max_nb_of_workers`, we could run multiple scenarios, and multiple tasks of a scenario, at the same time.

The [_submit_](https://docs.taipy.io/en/latest/manuals/reference/taipy.core.Scenario/#taipy.core.scenario.scenario.Scenario.submit) function accepts some options:

- _wait_: if _wait_ is True, the submission is synchronous and waits for all the jobs to finish (if _timeout_ is not defined).
- _timeout_: if _wait_ is True, Taipy waits for the jobs to finish for at most _timeout_ seconds.



In [35]:
if __name__=="__main__":
    tp.Core().run()
    scenario_1 = tp.create_scenario(scenario_cfg)
    scenario_1.submit(wait=True)
    scenario_1.submit(wait=True, timeout=5)
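
The semantics of _wait_ and _timeout_ resemble a bounded wait on a future. The sketch below illustrates that idea with the standard library only. It is an analogy under that assumption, not Taipy code: `slow_job` is a hypothetical stand-in for a job, and what Taipy does once a timeout elapses is not modeled here.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def slow_job(delay):
    """Stand-in for a job that takes `delay` seconds."""
    time.sleep(delay)
    return "done"


with ThreadPoolExecutor(max_workers=2) as pool:
    jobs = [pool.submit(slow_job, 0.5)]

    # Bounded wait: return after 0.05 s even though the job is still running.
    finished, pending = wait(jobs, timeout=0.05)
    timed_out = len(pending) == 1

    # Unbounded wait: block until every job has finished.
    finished, pending = wait(jobs)
    all_finished = len(finished) == 1 and not pending

print(timed_out, all_finished)
```

The first call gives control back before the job ends, while the second one blocks until the result is available.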


> You can download the code of this step [here](https://docs.taipy.io/en/latest/getting_started/src/step_08.py) or all the steps [here](https://github.com/Avaiga/taipy-getting-started-core/tree/develop/src).

# Step 8: Scenario comparison

This step reuses the configuration provided in the previous step except for the [scenario configuration](https://docs.taipy.io/en/latest/manuals/core/entities/scenario-cycle-mgt/#compare-scenarios).

<div align="center">
 <img src="https://raw.githubusercontent.com/Avaiga/taipy-getting-started-core/develop/step_08/config_08.svg" width=700>
</div>

Taipy lets you compare scenarios by passing a comparison function directly in the scenario's configuration.

## Step 1: Declare which Data Nodes the comparison functions apply to

Taipy can compare Data Nodes. In this example, we want a comparison applied to the '_output_' Data Node. This is specified through the `comparators` parameter of `configure_scenario()`.



In [36]:
# compare_function is implemented in the next step; it must be defined before this cell runs
scenario_cfg = Config.configure_scenario(id="multiply_scenario",
                                         name="my_scenario",
                                         pipeline_configs=[pipeline_cfg],
                                         comparators={output_data_node_cfg.id: compare_function},
                                         frequency=Frequency.MONTHLY)


## Step 2: Implement the comparison function (`compare_function()`) used above.

_data_node_results_ holds the data read from the '_output_' Data Node of each scenario passed to the comparator. We iterate through it to compare every pair of scenarios.



In [37]:
def compare_function(*data_node_results):
    # Build a table of pairwise differences between the scenarios' results
    compare_result = {}
    for current_res_i, current_res in enumerate(data_node_results):
        compare_result[current_res_i] = {}
        for next_res_i, next_res in enumerate(data_node_results):
            print(f"comparing result {current_res_i} with result {next_res_i}")
            compare_result[current_res_i][next_res_i] = next_res - current_res
    return compare_result
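
To see what such a comparison produces, the same pairwise-difference logic can be run on plain numbers without Taipy. In this sketch, `pairwise_differences` is a hypothetical, print-free equivalent of `compare_function()`, and the values 42 and 52 are made up for illustration.

```python
def pairwise_differences(*results):
    # Entry [i][j] holds results[j] - results[i], as in compare_function().
    return {
        i: {j: other - current for j, other in enumerate(results)}
        for i, current in enumerate(results)
    }


# Hypothetical output values of two scenarios.
comparison = pairwise_differences(42, 52)
print(comparison)
# prints: {0: {0: 0, 1: 10}, 1: {0: -10, 1: 0}}
```

The diagonal is always zero, and entry `[i][j]` is the opposite of entry `[j][i]`.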



Now `tp.compare_scenarios()` can be used to compare the two scenarios.



In [38]:
tp.Core().run()

scenario_1 = tp.create_scenario(scenario_cfg)
scenario_2 = tp.create_scenario(scenario_cfg)


print("\nScenario 1: submit")
scenario_1.submit()
print("Value", scenario_1.output.read())

print("\nScenario 2: first submit")
scenario_2.submit()
print("Value", scenario_2.output.read())


print(tp.compare_scenarios(scenario_1, scenario_2))



## Taipy Rest

Taipy Rest lets the user navigate through the entities of the application and also create and submit scenarios. The Taipy Rest commands are referenced [here](https://docs.taipy.io/en/latest/manuals/reference_rest/).



In [39]:
tp.Rest().run()
