# Creating Variables

Variables are created and assigned to the model using the function 

```
model.add_variables
```
where `model` is a `linopy.Model` instance. In the following we show how this function works and what the resulting variables look like. So, let's create a model and go through it!

In [None]:
import numpy as np
import pandas as pd
import xarray as xr

from linopy import Model

m = Model()

First of all, it is crucial to know that the return value of the `.add_variables` function is a `linopy.Variable`, which itself contains all important information and provides helpful functions. It can have an arbitrary number of labeled dimensions. For each combination of coordinates, exactly one representative scalar variable is defined and, in the end, passed to the solver.
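
To make the "one scalar variable per combination of coordinates" idea concrete, here is a minimal sketch that only counts the combinations. It uses the standard library and does not call linopy. The dimension names `time` and `station` and their labels are hypothetical and chosen for illustration only.

```python
from itertools import product

# Hypothetical coordinates for two labeled dimensions.
coords = {
    "time": [0, 1, 2],
    "station": ["a", "b"],
}

# Each element of the Cartesian product stands for one scalar variable.
combinations = list(product(*coords.values()))
n_scalar_variables = len(combinations)
print(n_scalar_variables)
```

A variable defined over these two dimensions would therefore represent as many scalar variables as there are entries in `combinations`.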

The first three arguments of the `.add_variables` function are

1. `lower`, the lower bound of the variables (default `-inf`)
2. `upper`, the upper bound of the variables (default `+inf`)
3. `coords`, the coordinates of the variables (default `None`)

These arguments determine the shape of the added variable.

Generally, the function closely follows the initialization of an `xarray.DataArray`. Therefore `lower` and `upper` can be

* scalar values (int/float)
* numpy ndarrays
* pandas Series
* pandas DataFrames
* xarray DataArrays


Note that scalars, numpy objects and pandas objects either do not have or do not require dimension names. Unless you name the dimensions yourself, the naming is left to `xarray`, which assigns default names. To set the names explicitly, pass the `coords` argument or, alternatively, a `dims` argument.
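
The following sketch shows, on plain `xarray.DataArray` objects rather than on linopy variables, how dimension names come about. The name `time` and the labels `10, 20, 30` are illustrative assumptions.

```python
import numpy as np
import xarray as xr

values = np.array([1, 2, 3])

# Without names, xarray falls back to a generic dimension name.
unnamed = xr.DataArray(values)
print(unnamed.dims)

# Passing `dims` names the dimension explicitly.
named = xr.DataArray(values, dims=["time"])
print(named.dims)

# Passing `coords` together with `dims` also sets the labels.
labeled = xr.DataArray(values, coords={"time": [10, 20, 30]}, dims=["time"])
print(labeled.dims, labeled.indexes["time"].tolist())
```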


.. hint::
   It is **best practice** to always define variables with explicit `name` and dimension names. This eases the inspection and avoids confusion from the automatically derived names.

Let's start by creating a simple variable:

If we keep the defaults, `-inf` for `lower` and `+inf` for `upper`, the code returns

In [None]:
x = m.add_variables(name="x")
x

which is a variable without any coordinates and with just one optimization variable. The variable name is set by `name = 'x'`. 

This way, the variable appears under its name when it is used in an expression:

In [None]:
x + 5

We can alter the lower and upper bounds of the variable by assigning scalar values to them.

In [None]:
y = m.add_variables(lower=0, upper=4, name="y")

### Variable Types

By default, the variable type is continuous, meaning that the variables can take any real value between the lower and upper bounds, including the bounds themselves. To change the type, set `integer` or `binary` to `True`.

In [None]:
m.add_variables(lower=0, upper=10, integer=True)

.. note::
   Since we did not set the `name` argument, the variable name is determined automatically and set to `var0`.


This variable `var0` can take any integer value between 0 and 10, inclusive. When defining a binary variable, on the other hand, we do not specify the lower and upper bounds and instead set `binary` to `True`.

In [None]:
m.add_variables(binary=True)

### Working with dimensions

When initializing dimensional variables, it is most straightforward, and recommended, to create them from `DataArray`s which are passed as `lower` and/or `upper`.

In [None]:
lower = xr.DataArray([1, 2, 3])
v = m.add_variables(lower, name="v")
v

The returned `Variable` now has the same shape as the `lower` bound that we passed to the initialization. Since we did not specify any dimension name, it defaults to `dim_0`. To give the dimension a proper name, we can use the `dims` argument of `xr.DataArray`.

In [None]:
lower = xr.DataArray([1, 2, 3], dims=["my-dim"])
m.add_variables(lower)

You can broadcast dimensions arbitrarily by passing `DataArray`s with different sets of dimensions. Let's do that and give `lower` a different dimension than `upper`:

In [None]:
lower = xr.DataArray([1, 2, 3], dims=["my-dim"])
upper = xr.DataArray([10, 11, 12, 13], dims=["my-dim-2"])
m.add_variables(lower, upper)

Now, instead of a single dimension, we end up with the two dimensions `my-dim` and `my-dim-2` in the variable. This kind of **broadcasting** is deeply incorporated in the functionality of linopy.
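
As a rough illustration of what broadcasting does to the bounds, the sketch below applies `xr.broadcast` to the two arrays directly. It assumes that the bounds of the variable are expanded according to the same `xarray` rules, and it does not call linopy itself.

```python
import xarray as xr

lower = xr.DataArray([1, 2, 3], dims=["my-dim"])
upper = xr.DataArray([10, 11, 12, 13], dims=["my-dim-2"])

# Expand both arrays to the union of their dimensions.
lower_b, upper_b = xr.broadcast(lower, upper)

print(lower_b.dims, lower_b.shape)
print(upper_b.dims, upper_b.shape)
```

Each of the 3 x 4 entries then carries its own pair of lower and upper bounds, with `lower` repeated along `my-dim-2` and `upper` repeated along `my-dim`.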

We recall that, in order to improve the inspection, it is encouraged to define a `name` when creating a variable. So in your model you would rather write something like:

In [None]:
lower = xr.DataArray([1, 2, 3], dims=["time"])
upper = xr.DataArray([10, 11, 12, 13], dims=["station"])
m.add_variables(lower, upper, name="supply")

#### Initializing variables with numpy arrays

If `lower` and `upper` are numpy arrays, it is recommended to pass a `dims` or a `coords` argument.

In [None]:
lower = np.array([1, 2])
upper = np.array([10, 10])
m.add_variables(lower, upper, dims=["my-dim"])

This is equivalent to the following

In [None]:
my_dim = pd.RangeIndex(2, name="my-dim")
lower = np.array([1, 2])
upper = np.array([10, 10])
m.add_variables(lower, upper, coords=[my_dim])

Note that 

- `dims` is a list of strings defining the dimension names.

- `coords` is a sequence (list or tuple) of indexes, as expected by `xarray.DataArray`.

- The shape of `lower` and `upper` is aligned with `coords`.

- When defining the index for the coords, a name was set in the index creation. This is helpful because it makes explicit which dimension the variable is defined on.

Let's make the same example without setting an explicit dimension name:

In [None]:
coords = (pd.RangeIndex(2),)
m.add_variables(lower=lower, coords=coords)

The dimension is now called `dim_0`. Any further variable that is added without dimension names will use that same dimension name. When combining variables into expressions, it is important to make sure that the dimension names represent what they should.

.. hint::
  If you want to make sure that you are not mixing up dimensions, create the model with the flag `force_dim_names = True`, i.e.

In [None]:
other = Model(force_dim_names=True)
try:
    other.add_variables(lower=lower, coords=coords)
except ValueError as e:
    print("This raised an error:", e)

#### Initializing variables with Pandas objects

Pandas objects always have indexes but do not require dimension names. It is again helpful to ensure that the variables have explicit dimension names when passing `lower` and `upper` without `coords`. This can be done either by passing the `dims` argument to the `.add_variables` function, i.e.

In [None]:
lower = pd.Series([1, 1])
upper = pd.Series([10, 12])
m.add_variables(lower, upper, dims=["my-dim"])

or naming the indexes and columns of the pandas objects directly, e.g.

In [None]:
lower = pd.Series([1, 1]).rename_axis("my-dim")
upper = pd.Series([10, 12]).rename_axis("my-dim")
m.add_variables(lower, upper)

.. note::
   Again, if `lower` and `upper` do not have the same dimension names, the arrays are broadcast, meaning that the dimensions are spanned:

In [None]:
lower = pd.Series([1, 1]).rename_axis("my-dim")
upper = pd.Series([10, 12]).rename_axis("my-other-dim")
m.add_variables(lower, upper)

Now instead of 2 variables, 4 variables were defined.  
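
The count of four can be reproduced on the bounds alone. The sketch below converts the two Series to `xarray.DataArray` objects and broadcasts them. It assumes that linopy converts pandas objects in a comparable way, and it does not call linopy itself.

```python
import pandas as pd
import xarray as xr

lower = pd.Series([1, 1]).rename_axis("my-dim")
upper = pd.Series([10, 12]).rename_axis("my-other-dim")

# xarray takes the dimension name from the name of the pandas index.
lower_da = xr.DataArray(lower)
upper_da = xr.DataArray(upper)

# Different dimension names are spanned instead of aligned.
lower_b, upper_b = xr.broadcast(lower_da, upper_da)
print(lower_b.dims, lower_b.size)
```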

Similar behavior applies when passing a DataFrame and a Series without dimension names. The index axis is the first axis of both objects, so the two indexes are expected to be the same. (Note that the pandas convention, in contrast, is that a Series is aligned and broadcast along the column dimension of a DataFrame.)

In [None]:
lower = pd.DataFrame([[1, 1, 2], [1, 2, 2]])
upper = pd.Series([10, 12])
m.add_variables(lower, upper)
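
For comparison, the pandas convention mentioned above can be seen in plain pandas arithmetic. This sketch uses only pandas and the same values as the cell above. The names `df`, `s`, `by_columns` and `by_index` are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 1, 2], [1, 2, 2]])
s = pd.Series([10, 12])

# Default arithmetic matches the Series index against the DataFrame columns;
# column 2 has no partner in `s` and becomes NaN.
by_columns = df + s

# Matching along the index instead has to be requested explicitly.
by_index = df.add(s, axis=0)

print(by_columns)
print(by_index)
```

Only the second form pairs the Series with the rows of the DataFrame, which is why explicit dimension names are the safer choice.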

Again, one is always safer when explicitly naming the dimensions:

In [None]:
lower = lower.rename_axis(index="my-dim", columns="my-other-dim")
upper = upper.rename_axis("my-dim")
m.add_variables(lower, upper)

.. important::

    **New in version 0.3.6**

    As pandas objects always have indexes, the `coords` argument is not required and is ignored if passed. Before, it was used to overwrite the indexes of the pandas objects. A warning is raised if `coords` is passed and is not aligned with the pandas object.

In [None]:
unaligned_coords = pd.Index([1, 2]), pd.Index([2, 3, 4])
m.add_variables(lower, upper, coords=unaligned_coords)

### Masking Arrays

In some cases, you want to create a variable with given dimensions, but not all of its entries should be active.

For example, think of a set of ports between which goods can be transported. A port cannot transport goods to itself, however. For such a case, you would create a variable `transport` with the dimensions (`from`, `to`) and with the entries on the diagonal disabled.

To do so, you can pass a `mask` argument which has `False` values on the diagonal and `True` elsewhere.

In [None]:
ports = list("abcdef")
port_from = pd.Index(ports, name="from")
port_to = pd.Index(ports, name="to")

mask = np.ones((len(ports), len(ports)), dtype=bool)
np.fill_diagonal(mask, False)
mask

In [None]:
transport = m.add_variables(
    lower=0, coords=[port_from, port_to], name="transport", mask=mask
)
transport

Now the diagonal entries, for example the one at [a, a], are `None`.

### Accessing assigned variables

All variables added to the model are stored in the `.variables` container.

In [None]:
m.variables

You can always access the variables from the `.variables` container either by get-item, i.e.

In [None]:
m.variables["x"]

or by attribute access:

In [None]:
m.variables.x