# Generate measurements

For this tutorial you will generate some sample (fake) measurement data so you can post it to your project.

You're going to generate a fake spectrum for every device on every die of the wafer and upload each one to your project as a Parquet file.

## Imports

In [None]:
import getpass
import itertools
from pathlib import Path

import gfhub
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from gfhub import nodes
from PIL import Image
from tqdm.notebook import tqdm

np.random.seed(42)  # always generate the same data.
user = getpass.getuser()
print(user)

## Client

In [None]:
client = gfhub.Client()
device_table_id = client.query_files(name="cutback_device_table.csv").newest().id
device_table_id

In [None]:
device_table = pd.read_csv(client.download_file(device_table_id))
device_table

## Grating Coupler Response

We can simulate a coupler response as follows:

In [None]:
def gaussian_grating_coupler_response(peak_power, center_wavelength, bandwidth_1dB, wavelength):
    """Calculate the response of a Gaussian grating coupler.

    Args:
        peak_power: The peak power of the response.
        center_wavelength: The center wavelength of the grating coupler.
        bandwidth_1dB: The 1 dB bandwidth of the coupler.
        wavelength: The wavelength at which the response is evaluated.

    Returns:
        The power of the grating coupler response at the given wavelength.

    """
    # Convert the full 1 dB bandwidth to a standard deviation (sigma).
    # A 1 dB drop is a factor 10 ** (-1 / 10), reached at +/- bandwidth_1dB / 2.
    sigma = bandwidth_1dB / (2 * np.sqrt(2 * np.log(10) / 10))

    # Gaussian response calculation
    return peak_power * np.exp(-0.5 * ((wavelength - center_wavelength) / sigma) ** 2)
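As a quick check on the bandwidth-to-sigma conversion, the following sketch generalizes it to an arbitrary drop in dB and verifies the drop numerically at the band edge. The helper names `sigma_from_bandwidth` and `gaussian` are hypothetical and not part of the tutorial; the only assumption is that "bandwidth" means the full width between the two points that sit `drop_dB` below the peak.

```python
import numpy as np


def sigma_from_bandwidth(bandwidth, drop_dB):
    """Sigma of a Gaussian whose full width at `drop_dB` below the peak is `bandwidth`.

    Hypothetical helper: solves exp(-0.5 * (x / sigma) ** 2) == 10 ** (-drop_dB / 10)
    for x == bandwidth / 2.
    """
    return bandwidth / (2 * np.sqrt(2 * np.log(10) * drop_dB / 10))


def gaussian(wavelength, center, sigma):
    """Unit-peak Gaussian response (linear power)."""
    return np.exp(-0.5 * ((wavelength - center) / sigma) ** 2)


center = 1.550  # um
bandwidth = 0.100  # um

# Evaluate the response at the upper band edge and express the drop in dB.
sigma_1dB = sigma_from_bandwidth(bandwidth, drop_dB=1.0)
edge_power = gaussian(center + bandwidth / 2, center, sigma_1dB)
edge_drop_dB = -10 * np.log10(edge_power)
print(f"drop at band edge: {edge_drop_dB:.3f} dB")
```

Note that setting `drop_dB=10` reduces the expression to `bandwidth / (2 * np.sqrt(2 * np.log(10)))`, which is the conversion for a 10 dB bandwidth.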

Let's have a look at one such response:

In [None]:
peak_power = 1.0
center_wavelength = 1.550  # um
bandwidth_1dB = 0.100  # um

df = pd.DataFrame(
    {
        "wl [um]": (
            wls := np.linspace(center_wavelength - 0.05, center_wavelength + 0.05, 150)
        ),
        # the response is linear power, so convert it to dB to match the column name
        "power [dB]": 10 * np.log10(
            gaussian_grating_coupler_response(
                peak_power, center_wavelength, bandwidth_1dB, wls
            )
        ),
    }
)

plt.plot(df['wl [um]'], df['power [dB]'])
plt.title("Gaussian Grating Coupler Response")
plt.grid(True)
plt.xlabel('wl [um]')
plt.ylabel('power [dB]')
plt.show()

## Function

We can create a `plot_parquet` function to plot two columns in a dataframe:

In [None]:
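def plot_parquet(path: Path, /, *, x: str, y: str) -> Path:
    """Plot column `y` against column `x` of a parquet file and save it as a PNG."""
    df = pd.read_parquet(path)
    fig, ax = plt.subplots()  # fresh figure, so repeated calls don't overlay curves
    ax.plot(df[x], df[y])
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    outpath = path.with_suffix(".png")  # same location and name, .png extension
    fig.savefig(outpath, bbox_inches="tight")
    plt.close(fig)  # release the figure once it is written to disk
    return outpath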
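The output path of `plot_parquet` comes from `Path.with_suffix`, which swaps only the final extension and keeps the directory and file stem. A minimal illustration follows; the path `data/temp.parquet` is a made-up example, and `PurePosixPath` is used only so the printed result does not depend on the operating system.

```python
from pathlib import PurePosixPath

# Hypothetical input path, for illustration only.
source = PurePosixPath("data/temp.parquet")

# with_suffix replaces the last extension; directory and stem are unchanged.
plot_path = source.with_suffix(".png")
print(plot_path)  # data/temp.png
```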

In [None]:
func_def = gfhub.Function(plot_parquet, dependencies={
    "pandas[pyarrow]": "import pandas as pd",
    "matplotlib": "import matplotlib.pyplot as plt",
})

In [None]:
temp_path = Path('temp.parquet').resolve()
df.to_parquet(temp_path)
result = func_def.eval(temp_path, x="wl [um]", y="power [dB]")
print(result)
Image.open(result['output'])

In [None]:
client.add_function(
    name="plot_parquet",
    script=func_def,
)

## Pipeline

Before we actually upload the data, let's create a pipeline to plot parquet files. This pipeline generates a `.png` and links it to the source `.parquet` file that we're about to upload. Once the pipeline is enabled, any parquet file uploaded with the matching tags is automatically converted to a `.png` plot:

In [None]:
p = gfhub.Pipeline()

# auto trigger on file upload for files with the specified tags
p.trigger = nodes.on_file_upload(tags=[".parquet", "project:cutback", user])

# load files from S3 to the local file system
p.load = nodes.load()

# connect the trigger to the load node
p += p.trigger >> p.load

# once loaded we can plot the file with the function we just created above
p.plot = nodes.function(
    function="plot_parquet", kwargs={"x": "wl [um]", "y": "power [dB]"}
)

# connect the load node to the plot node
p += p.load >> p.plot

# once plotted, an output path for the plot on the local file system is created
# this one needs to be saved to S3 with a save node:
p.save = nodes.save()

# connect the plot node to the save node
p += p.plot >> p.save[0]

# we can also load the tags with which a file was uploaded:
p.load_tags = nodes.load_tags()

# these tags can also be passed on to the save node's second input
# this way the output will have the same tags as the input
p += p.trigger >> p.load_tags
p += p.load_tags >> p.save[1]

# once the pipeline is created locally we can upload it:
confirmation = client.add_pipeline("plot_parquet", p)

The pipeline can be viewed here:

In [None]:
print(client.pipeline_url(confirmation['id']))

If anything does not look right, you can adjust the pipeline and open the new URL for it.

If everything went well, the pipeline is now uploaded and active. Any `.parquet` file uploaded with the `project:cutback` tag and your user tag will automatically be processed to generate a plot for it.
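To make the wiring of the pipeline easier to follow, here is a small sketch that models the same node connections as a plain dependency graph and derives a valid execution order. It uses only the Python standard library and is not the gfhub API; the node names simply mirror the attributes assigned on `p` above.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes that feed into it,
# mirroring the `>>` connections made on the pipeline.
predecessors = {
    "load": {"trigger"},
    "load_tags": {"trigger"},
    "plot": {"load"},
    "save": {"plot", "load_tags"},  # save[0] gets the plot, save[1] gets the tags
}

# A topological order lists every node after all of its inputs.
execution_order = list(TopologicalSorter(predecessors).static_order())
print(execution_order)
```

The trigger always comes first and the save node last; the relative order of the two branches (`load` then `plot`, versus `load_tags`) is not fixed, because they are independent of each other.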

## Clean up (optional)

Let's delete any existing files from this project so you can start fresh.

In [None]:
# Delete existing project files
existing_files = client.query_files(tags=["project:cutback", user])

# keep the files uploaded in the previous notebook
existing_files = [f for f in existing_files if f['original_name'] not in ('cutback_device_table.csv', 'cutback.gds')]

for file in tqdm(existing_files):
    client.delete_file(file['id'])

## Upload generated spectra

You can easily generate some spectrum data and add some noise to make it look like a real measurement.

In [None]:
wafer_id = "wafer1"
wafer_definitions = Path("wafer_definitions.json")
wafers = [wafer_id]
dies = [{"x": x, "y": y} for y in range(-3, 4) for x in range(-3, 4) if not (abs(y) == 3 and abs(x) == 3)]
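The die list covers a 7 x 7 grid with the four corners removed, which roughly approximates a round wafer. The sketch below rebuilds the same list and prints it as a small text map; the map rendering is an illustrative addition, not part of the tutorial.

```python
# Same die coordinates as in the tutorial: a 7 x 7 grid without its corners.
dies = [
    {"x": x, "y": y}
    for y in range(-3, 4)
    for x in range(-3, 4)
    if not (abs(y) == 3 and abs(x) == 3)
]

# Render the layout with y increasing upwards: '#' is a die, '.' is empty.
occupied = {(die["x"], die["y"]) for die in dies}
rows = [
    " ".join("#" if (x, y) in occupied else "." for x in range(-3, 4))
    for y in range(3, -4, -1)
]
wafer_map = "\n".join(rows)

print(len(dies))  # 45
print(wafer_map)
```

With 45 dies per wafer, the upload loop creates 45 files for every row in the device table.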

In [None]:
cwd = Path.cwd()
grating_coupler_loss_dB = 3
device_loss_dB = 0.1
noise_peak_to_peak_dB = device_loss_dB / 10
device_loss_noise_dB = device_loss_dB / 10 * 2
for wafer, die, row in tqdm(
    list(itertools.product(wafers, dies, device_table.to_numpy()))
):
    die = f"{die['x']},{die['y']}"
    cell, dev_x, dev_y, components = row
    device = f"{dev_x},{dev_y}"
    T = 25.0  # temperature
    loss_dB = 2 * grating_coupler_loss_dB + components * (
        device_loss_dB + device_loss_noise_dB * np.random.rand()
    )
    peak_power = 10 ** (-loss_dB / 10)
    output_power = gaussian_grating_coupler_response(
        peak_power, center_wavelength, bandwidth_1dB, wls
    )
    output_power = np.array(output_power)
    output_power *= 10 ** (
        noise_peak_to_peak_dB * np.random.rand(wls.shape[0]) / 10
    )
    output_power = 10 * np.log10(output_power)
    df = pd.DataFrame({
        "wl [um]": wls,
        "power [dB]": output_power,
    })
    client.add_file(
        df,
        tags=[
            user,
            "project:cutback",
            f"wafer:{wafer}",
            f"die:{die}",
            f"cell:{cell}",
            f"device:{device}",
            f"T:{T}",
            f"components:{components}",
        ],
        filename="cutback_device.parquet",
    )
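The loss model used in the loop can be checked in isolation for a single device. The sketch below reproduces the same arithmetic without uploading anything. Several values are assumptions chosen for illustration: the component count of 400, the Gaussian width `sigma`, and the use of NumPy's `default_rng` generator in place of the global `np.random.rand` calls above.

```python
import numpy as np

rng = np.random.default_rng(42)  # assumption: local generator instead of np.random.rand

grating_coupler_loss_dB = 3.0
device_loss_dB = 0.1
device_loss_noise_dB = device_loss_dB / 10 * 2  # 0.02 dB spread per component
noise_peak_to_peak_dB = device_loss_dB / 10  # 0.01 dB ripple on the spectrum
components = 400  # hypothetical number of components in the cutback device

# Two grating couplers (in and out) plus the per-component loss with random spread.
loss_dB = 2 * grating_coupler_loss_dB + components * (
    device_loss_dB + device_loss_noise_dB * rng.random()
)
peak_power = 10 ** (-loss_dB / 10)  # linear transmission at the center wavelength

# Clean Gaussian spectrum in linear power (sigma is an illustrative value).
wavelengths = np.linspace(1.50, 1.60, 150)  # um
sigma = 0.05  # um, hypothetical
clean_power = peak_power * np.exp(-0.5 * ((wavelengths - 1.55) / sigma) ** 2)

# Multiplicative noise: a random offset between 0 and noise_peak_to_peak_dB per sample.
noise_dB = noise_peak_to_peak_dB * rng.random(wavelengths.shape[0])
noisy_power_dB = 10 * np.log10(clean_power * 10 ** (noise_dB / 10))

print(f"total loss: {loss_dB:.2f} dB")
print(f"max measured power: {noisy_power_dB.max():.2f} dB")
```

For 400 components the total loss falls between 46 dB and 54 dB, so the per-component loss dominates the fixed 6 dB coupler loss. This dependence on the number of components is what a cutback analysis relies on to extract the per-component loss.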