<a href="https://colab.research.google.com/github/nickguimara/IHP-Open-DesignLib/blob/main/main_notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Coherent Noise Removal Filter for Particle Physics Applications: MARTIn chip**

```
Submission to IEEE SSCS Open-Source Ecosystem “Code-a-Chip” Travel Grant Awards at ISSCC'25
SPDX-License-Identifier: Apache-2.0

```

|Name|Affiliation|IEEE Member|SSCS Member|
|:--:|:----------:|:----------:|:----------:|
|Nicolas Abgar Tavares Guimarães (Lead) <br /> Email ID: nicolas.abgar@usp.br|USP, Brazil <br /> |Yes|Yes|
|Guilherme Kenji Aniya Milani <br /> Email ID: guilherme.kenji@usp.br|USP, Brazil <br /> |No|No|
|Bruno Cavalcante de Souza Sanchez (Advisor) |USP, Brazil <br /> |Yes|Yes|


#Table of Contents

<ol>
        <li><strong>Introduction</strong></li>
        <li>
            <strong>Algorithms</strong>
            <ol>
                <li><strong>Bitonic sorting</strong></li>
                <li><strong>Bit-wise Median Finder (BWMF) algorithm</strong></li>
                <li><strong>Combinatorial Sum Median Finder (CSMF) algorithm</strong></li>
                <li><strong>Tests and results</strong></li>
            </ol>
        </li>
        <li><strong>MARTIn chip</strong></li>
        <li>
            <strong>RTL to GDS flow</strong>
            <ol>
                <li><strong>Environment setup</strong></li>
                <li><strong>Getting source files from GitHub</strong></li>
                <li><strong>Converting System Verilog source files to Verilog</strong></li>
                <li><strong>Generating the sealring</strong></li>
                <li><strong>OpenROAD flow</strong></li>
                <li><strong>Filler scripts</strong></li>
            </ol>
        </li>
        <li>
            <strong>Verification</strong>
            <ol>
                <li><strong>Post flow simulation</strong></li>
                <li><strong>DRC</strong></li>
            </ol>
        </li>
        <li><strong>Results</strong></li>
        <li><strong>Visualization</strong></li>
        <li><strong>Conclusion</strong></li>
        <li><strong>References</strong></li>
    </ol>

#1. Introduction

Micropattern Gaseous Detectors (MPGDs) are high-precision devices used for particle detection in high-energy physics experiments. One notable application is ALICE (A Large Ion Collider Experiment), one of the four major experiments at the Large Hadron Collider (LHC) at CERN (European Organization for Nuclear Research). The experiment investigates the physics of strongly interacting matter, known as quark-gluon plasma, formed in lead-ion collisions [1].


<div style="display: flex; justify-content: space-around;">
    <div style="margin: 10px; text-align: center;">
        <img src="https://drive.google.com/uc?export=view&id=1FPYASbGd2f-He58vtIiUI8VTPgv6bXZq" alt="Image 1" style="width: 400px; height: auto;">
        <p>Fig 1. ALICE TPC <a href="https://www.researchgate.net/publication/241772093_Identifying_Charged_Hadrons_on_the_Relativistic_Rise_Using_the_ALICE_TPC_at_LHC" target="_blank">(Source)</a></p>
    </div>
    <div style="margin: 10px; text-align: center;">
        <img src="https://drive.google.com/uc?export=view&id=13AWx9Pm_v3-qzgsUSjOfM0pg8ScpGKLf" alt="Image 2" style="width: 400px; height: auto;">
        <p>Fig 2. Installation of a Gas Electron Multiplier (GEM) pad, a type of MPGD <a href="https://cds.cern.ch/record/2758225/plots" target="_blank">(Source)</a></p>
    </div>
</div>

![](https://drive.google.com/uc?export=view&id=17X0G9kFhVLHFeOF_xY1Ye2sMh5BYoPJ5)
Fig 3. ALICE experiment ([Source](https://cds.cern.ch/record/2263642))



In this context, the SAMPA chip was developed: a 32-channel ASIC designed to upgrade the readout front-end electronics of the ALICE Time Projection Chamber (TPC) and Muon Chamber (MCH) after the second long shutdown of the LHC. The chip is responsible for amplification, digital conversion, processing, compression and filtering of the signals from the detectors [2].

Furthermore, the [SALSA](https://indico.cern.ch/event/1327482/contributions/5692916/) chip project, being developed in a collaboration between the University of São Paulo (USP) and the Institute of Research into the Fundamental Laws of the Universe (IRFU), aims to be a new versatile readout chip for the MPGDs of future experiments at the Electron-Ion Collider (EIC), which will be installed at the Brookhaven National Laboratory in the USA in early 2030 [3].

This new chip is an ASIC designed to read, amplify and convert the analog signals captured by the detectors using an ADC [4]. Compared with SAMPA, it accommodates a wider range of signal amplitudes and electrode capacitances, and offers a larger range of peaking times and optimized data processing. Its architecture incorporates digital signal filters to mitigate noise interference, ensuring high accuracy in particle collision data. While the current filtering process employs FIR, IIR and slope-based filters, a new method based on median filtering, which implements common-mode noise (CMN) subtraction, is being tested.

Common-mode noise (CMN) subtraction is a non-linear digital filter that aims to reduce the coherent noise affecting all channels. It evaluates the coherent noise and subtracts it from each channel’s sample [5]. The evaluation consists of calculating the median of the signals from the 64 channels in real time; this median is then used to reduce noise interference and improve the precision of the data acquisition. For the calculation of the channels' median, three different approaches are considered, each employing a distinct algorithm, as described in the next section.

#2. Algorithms

##2.1) Bitonic sorting
Bitonic sort is a highly parallel sorting algorithm based on the concept of a bitonic sequence, that is, a sequence of numbers that first monotonically increases and then monotonically decreases, or vice versa. This method was chosen as the baseline for comparison with the other two, newly developed algorithms.

The sorting process consists of two main steps. First, the unsorted list is recursively divided into smaller sub-sequences, halving the list each time until the smallest units, consisting of 2 elements, are formed. Each small sub-sequence is transformed into a bitonic sequence, in which one portion is sorted in ascending order and the other in descending order. In each subsequent stage, the number of elements in the sequences doubles until the full list of 64 values is reached. Then, by comparing and swapping pairs of elements, these bitonic sequences are merged and sorted, moving the largest elements to the lower half and the smallest to the upper half, so that the final sequence is in the correct order [6].

In a hardware implementation, two types of comparators are used: the ascending order comparator (AO Comparator) and the descending order comparator (DO Comparator). Each comparator takes two input values and produces two output values sorted according to its respective order. These comparators build the smaller bitonic sequences, and by doubling the size of the sequences at each stage, the algorithm sorts the full set of 64 data values in ascending order. Once the values are sorted, the median is taken as the 33rd element (as the SALSA chip has 64 channels, an even number, the median is taken to be the 33rd greatest value for simplicity).



![](https://drive.google.com/uc?export=view&id=1tHkjwhY_MyDJrTq3SuGkZHiGqusY-Cpv)

Fig. 4 Diagram for the Bitonic Sort Algorithm
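
The compare-and-swap structure described above can be sketched in software. The following is a minimal Python model, not the hardware implementation: the function names `bitonic_sort`, `bitonic_merge` and `median_by_bitonic_sort` are hypothetical, the input length is assumed to be a power of two, and the median is assumed to be the 33rd element of the ascending order (index 32) as stated in the text.

```python
import random


def bitonic_merge(seq, ascending):
    """Merge a bitonic sequence into a fully sorted one."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    seq = list(seq)
    # One column of comparators: element i is compared with element i + half.
    for i in range(half):
        if (seq[i] > seq[i + half]) == ascending:
            seq[i], seq[i + half] = seq[i + half], seq[i]
    return bitonic_merge(seq[:half], ascending) + bitonic_merge(seq[half:], ascending)


def bitonic_sort(values, ascending=True):
    """Sort a list whose length is a power of two."""
    n = len(values)
    if n <= 1:
        return list(values)
    half = n // 2
    # Build a bitonic sequence: first half ascending, second half descending.
    first = bitonic_sort(values[:half], True)
    second = bitonic_sort(values[half:], False)
    return bitonic_merge(first + second, ascending)


def median_by_bitonic_sort(values):
    """Return the element at index len/2 of the ascending order (33rd of 64)."""
    return bitonic_sort(values)[len(values) // 2]


random.seed(0)
samples = [random.randrange(4096) for _ in range(64)]  # 64 channels, 12 bits
print(median_by_bitonic_sort(samples))
```

Each loop in `bitonic_merge` corresponds to one column of comparators working in parallel in hardware, which is why the network needs only a few pipeline stages for 64 inputs.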

##2.2) Bit-wise Median Finder (BWMF) algorithm
The bit-wise median finder is a new algorithm developed by the SALSA team that uses a pipelined implementation to compute the median efficiently, processing the 12-bit input data across multiple stages. Running from the MSB to the LSB of each 12-bit input register, one bit position per stage, the algorithm computes a partial median bit, given by the majority between 1’s and 0’s at that bit position. By comparing the partial median with the actual data bit of each channel, it is possible to refine the set of candidate values for the final median.

Once the partial median is established for the current stage, the algorithm compares it with the corresponding bit of each 12-bit register. If the bit does not match the majority bit (the partial median), that register is marked as a non-candidate for the final median in subsequent stages, and its contents are replaced with all 1’s or all 0’s depending on the median value registered for that stage. Conversely, if the bit matches the partial median, the value remains a potential candidate for the final result and stays in the collection of values for further comparisons.

This process repeats across all bit positions, progressively refining the set of candidate values and storing the partial medians in a 12-bit register. By the end of the pipeline, the full median is available in this register as the output.


![](https://drive.google.com/uc?export=view&id=1F_p7qS5uWsfMeH1_2mUnaTvJmKvG1L9U)


Fig. 5 Diagram of a generic stage of the BWMF Algorithm
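
The stage-by-stage refinement can be illustrated with a small software model. This is a sketch under stated assumptions rather than the SALSA team's RTL: the function name `bitwise_median` is hypothetical, a tie between 1's and 0's is assumed to count as a majority of 1's (so that the result matches the 33rd element of the ascending order), and a rejected value is modelled as a constant bit (0 or 1) for all remaining stages.

```python
import random


def bitwise_median(values, width=12):
    """Bit-serial median: one bit position per stage, from MSB to LSB."""
    n = len(values)
    rank = n // 2                      # index of the wanted element in ascending order
    forced = [None] * n                # None: still a candidate; 0 or 1: rejected value
    median = 0
    for bit in range(width - 1, -1, -1):
        # Effective bit seen by this stage for every channel.
        bits = [(v >> bit) & 1 if f is None else f for v, f in zip(values, forced)]
        ones = sum(bits)
        # Partial median bit: majority vote (assumption: a tie resolves to 1).
        m = 1 if ones >= n - rank else 0
        median = (median << 1) | m
        for i, b in enumerate(bits):
            if forced[i] is None and b != m:
                # A candidate with bit 0 where the median has 1 is smaller than the
                # median, so it becomes all 0's; the opposite case becomes all 1's.
                forced[i] = b
    return median


random.seed(0)
samples = [random.randrange(4096) for _ in range(64)]
print(bitwise_median(samples), sorted(samples)[32])
```

The two printed numbers should agree, since rejected values keep voting on the correct side of the median in every later stage.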

##2.3) Combinatorial Sum Median Finder (CSMF) algorithm
The combinatorial sum median finder is a hardware-implemented algorithm developed by the authors that finds the median value of a collection of data in a highly parallel manner.

The method is based on the fact that the median value (here assumed to be the 33rd greatest value of the data collection) is bigger than 32 other values. Therefore, by comparing every pair of channels, which results in 2016 comparisons (C<sub>64,2</sub>), we can calculate for each channel how many channels have smaller samples. One channel should then have 0 samples smaller than it, another 1, and so on up to a channel that has 63 samples smaller than it. The channel whose number of smaller samples is 32 holds the 33rd greatest number, in other words, the wanted median.

In the hardware implementation, each channel has two individual components: a Hamming weight component (a 1’s counter, which counts the number of ones in its input) and a comparator that compares the output of the 1’s counter with 32. In addition, there are 2016 comparators, one for every combination of two channels (channel X and channel Y). Each of these comparators has two outputs, one for X ≥ Y and the other for X < Y. The first output goes to the 1’s counter of channel X and the second goes to the 1’s counter of channel Y. With that circuitry, the 1’s counter of each channel counts the number of channels with smaller samples. That number goes to the comparator of the channel, and if it is equal to 32, this channel is the one that contains the median value and therefore drives the output. As the three aforementioned components are implemented in combinational logic, this design finds the median value in one clock cycle.


![](https://drive.google.com/uc?export=view&id=1B_DHov0GJN-EZHnjczaAafIlBUYkXCJE)

Fig. 6 Diagram for the Combinatorial Sum Median Finder
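
A behavioural sketch of the counting scheme is given below. It is a software illustration, not the authors' RTL: the function name `csmf_median` is hypothetical, and the loop over channel pairs stands in for the comparators that all operate at once in hardware. The X ≥ Y / X < Y outputs are modelled exactly as described above, so equal samples still receive distinct counts.

```python
import random
from itertools import combinations


def csmf_median(values):
    """Find the median by counting, per channel, how many channels are smaller."""
    n = len(values)
    smaller = [0] * n                      # one 1's counter per channel
    for x, y in combinations(range(n), 2):
        if values[x] >= values[y]:
            smaller[x] += 1                # output "X >= Y" feeds channel X's counter
        else:
            smaller[y] += 1                # output "X < Y" feeds channel Y's counter
    # Exactly one channel ends with a count of n // 2 (32 for 64 channels).
    return values[smaller.index(n // 2)]


# Number of pairwise comparators for 64 channels.
print(len(list(combinations(range(64), 2))))   # 2016

random.seed(0)
samples = [random.randrange(4096) for _ in range(64)]
print(csmf_median(samples), sorted(samples)[32])
```

Because every pair contributes to exactly one counter, the counts always form the set 0 to 63, which guarantees that a channel with count 32 exists even when several channels carry the same sample.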

##2.4) Tests and results
To evaluate the efficiency, accuracy and practicality of the three proposed algorithms (Bitonic Sort, BWMF and CSMF), the team ran a series of tests measuring the power, area and number of flip-flops required by their hardware implementations. The tests were done in a different technology and with a different flow, so they are presented here only to explain the choice of the algorithm to be implemented.

<table style="width:100%; border:1px solid black; border-collapse: collapse;">
  <caption>Comparison between filters with 64 channels and 50MHz</caption>
  <thead>
    <tr>
      <th>Filter</th>
      <th colspan="2">Bitonic</th>
      <th colspan="2">BWMF</th>
      <th colspan="2">CSMF</th>
    </tr>
    <tr>
      <th>Number of bits</th>
      <th>12</th>
      <th>10</th>
      <th>12</th>
      <th>10</th>
      <th>12</th>
      <th>10</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Number of cycles</td>
      <td>6</td>
      <td>6</td>
      <td>12</td>
      <td>10</td>
      <td>1</td>
      <td>1</td>
    </tr>
    <tr>
      <td>Area [mm<sup>2</sup>]</td>
      <td>0.089</td>
      <td>0.084</td>
      <td>0.055</td>
      <td>0.039</td>
      <td>0.141</td>
      <td>0.115</td>
    </tr>
    <tr>
      <td>Power [mW]</td>
      <td>11.25</td>
      <td>10.62</td>
      <td>5.84</td>
      <td>3.5</td>
      <td>9.08</td>
      <td>7.41</td>
    </tr>
    <tr>
      <td>Number of FFs</td>
      <td>3850</td>
      <td>3850</td>
      <td>5070</td>
      <td>3575</td>
      <td>12</td>
      <td>10</td>
    </tr>
  </tbody>
</table>


The comparison criteria are the following. The smaller the area, the better, as the cost of implementing an ASIC is proportional to its area and SALSA will be a large design. Concerning power, a low-consumption design is desired. In addition, because the chip operates in a highly radioactive environment, the number of flip-flops (FFs) is important: the more flip-flops there are, the greater the chance of a bit-flip occurring, in other words, the design becomes less error tolerant. Also, if the number of FFs is smaller, it is easier to implement triple modular redundancy (TMR), as was done in SAMPA [7]. This technique triplicates the storage components as a countermeasure to bit-flip events. Lastly, a small number of cycles is desirable so that the pipeline that waits for the median to be found takes less area and power.
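
To make the TMR argument concrete, the short calculation below triplicates the 12-bit flip-flop counts from the table above. It is only an illustrative estimate: it assumes that TMR multiplies the number of storage elements by exactly three and ignores the voters and any other overhead, which are not quantified here.

```python
# Flip-flop counts for the 12-bit versions, taken from the comparison table.
ff_counts = {"Bitonic": 3850, "BWMF": 5070, "CSMF": 12}

TMR_FACTOR = 3  # assumption: each storage element is simply triplicated

tmr_ff = {name: TMR_FACTOR * count for name, count in ff_counts.items()}

for name, count in ff_counts.items():
    extra = tmr_ff[name] - count
    print(f"{name}: {count} FFs -> {tmr_ff[name]} FFs with TMR (+{extra})")
```

Under this assumption, the CSMF design gains only 24 flip-flops, while the other two gain several thousand each.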

In the first approach, the sorting method using the bitonic algorithm was the most power-hungry of the three, while being the intermediate option in terms of area, number of cycles and number of flip-flops. Its high power consumption can discourage its use in power-constrained applications and in thermally sensitive environments, as it generates more heat during continuous operation.

The BWMF algorithm proved to be a good option if throughput is not the primary concern, since it has a small area and low power consumption. However, the large number of FFs requires a larger pipeline, which drains more power, occupies more area and increases the filter’s latency. Additionally, this high flip-flop count makes the approach more susceptible to radiation events and harder to implement with TMR, as triplicating all flip-flops would significantly increase the area. Therefore, this approach is suitable for low-power applications, but less suitable when fast response times are needed, due to its high latency, and it may not be optimal for highly radioactive environments.

Lastly, the CSMF algorithm is the best option when considering throughput, as it has the smallest number of cycles and its competitors cannot reach the same performance. Its area is considerably larger than the others, but the small number of flip-flops can compensate for this if TMR is implemented (triplicating a small number of FFs causes only a small increase in area), and the smaller number of cycles also helps reduce the overall area, as the pipeline required for its implementation is smaller. Thus, this design is well suited to highly radioactive environments considering its low flip-flop count, which simplifies the TMR implementation whilst maintaining acceptable dimensions and power consumption.

In conclusion, the CSMF algorithm turned out to be a promising approach for implementing a median filter in the context of high-energy physics experiments. Its balance between accuracy, computational efficiency, and moderate resource usage makes it particularly well suited to the demanding real-time data processing requirements of this field.

To validate this algorithm, the authors proposed a smaller chip with only 8 channels of 8 bits each. This chip ended up being called MARTIn (Median Algorithm Reduced Test Interface).

#3. MARTIn chip

MARTIn is a simplified chip that contains an implementation of the CSMF algorithm. It also includes an LFSR (Linear Feedback Shift Register), which works as a pseudo-random number generator for testing maximum clock speeds, and a shift register for radiation tolerance testing.
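
For readers unfamiliar with LFSRs, the sketch below shows how such a register produces a pseudo-random sequence. The width and feedback taps of the LFSR in MARTIn are not stated here, so the 8-bit width and the tap positions used below (a commonly tabulated maximal-length choice, x^8 + x^6 + x^5 + x^4 + 1) are assumptions made purely for illustration, as are the function names.

```python
def lfsr8_step(state):
    """Advance a hypothetical 8-bit Fibonacci LFSR by one clock cycle."""
    # Feedback bit: XOR of the tapped state bits (assumed taps, see lead-in).
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
    # Shift right by one and insert the feedback bit at the MSB.
    return (state >> 1) | (bit << 7)


def lfsr8_period(seed=0x01):
    """Count the clock cycles until the register returns to the seed value."""
    state = lfsr8_step(seed)
    steps = 1
    while state != seed:
        state = lfsr8_step(state)
        steps += 1
    return steps


print(lfsr8_period())  # a maximal-length 8-bit LFSR visits all 255 non-zero states
```

The all-zero state maps to itself, which is why an LFSR of this kind must be reset to a non-zero seed.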

The MARTIn chip was implemented using the IHP 130nm [BiCMOS Open Source PDK](https://github.com/IHP-GmbH/IHP-Open-PDK) because of the possibility of free fabrication through the [IHP Open Design Lib](https://ihp-open-ip.readthedocs.io/en/latest/). The chip was submitted to the November 2024 round and accepted [(To November 2024)](https://github.com/IHP-GmbH/TO_Nov2024). The design presented here is therefore ready for production, with 0 DRC errors.

The algorithm was implemented in the simple filter format shown below: 8 input registers (emulating the 8 channels) are connected to the median algorithm, followed by a MUX that selects the operation mode. With this format, the chip can operate in three modes:
 - __Transparency mode:__ input data goes directly to the output
 - __Median Output:__ output is the median value
 - __Filter Mode:__ output = input - median value

![](https://drive.google.com/uc?export=view&id=1lGQ7UFdQAthVlYsqeAD7IOK7W_5KGO2u)
Fig. 7 MARTIn block diagram

The chip has 8 pins on each side, for a total of 32 pins. They are the following:
- VDD, VSS, IOVDD, IOVSS : power pins
- clk, rst : general control signals
- data_in : data input for internal registers or median filter
- data_out : data output from median filter
- wr_enable : enable write in internal register
- reg_addr : selects the register to be written
- out_select : selects from median filter output, transparency mode and filter mode
  - 2'b00: data_out = median_value
  - 2'b01: Filter mode (data_out = data_in - median value)
  - 2'b10: Transparency mode (data_out = data_in)
- shreg_in : shift register input
- shreg_out : shift register output
- lfsr_out : Linear-feedback shift register output (Pseudo random number generator)
- aux_enable : enables the LFSR and shift register

![](https://drive.google.com/uc?export=view&id=1AocEzrOp0OUwoL4mmol5EIo2OcPV8-C_)
Fig. 8 MARTIn pinout
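
The three operating modes can be summarised with a small behavioural model. This is a sketch, not the chip's RTL: the function name `martin_output` is hypothetical, the median of the 8 registers is assumed to be the 5th value of the ascending order (the 8-channel analogue of the 33rd of 64), and the subtraction in filter mode is assumed to wrap around modulo 2^8, since the handling of negative results is not described here. The `out_select` encodings are the ones listed above.

```python
MEDIAN_MODE = 0b00        # data_out = median_value
FILTER_MODE = 0b01        # data_out = data_in - median value
TRANSPARENT_MODE = 0b10   # data_out = data_in


def martin_output(data_in, registers, out_select, width=8):
    """Behavioural model of the output MUX for one input sample."""
    median = sorted(registers)[len(registers) // 2]   # assumption: 5th of 8 values
    if out_select == MEDIAN_MODE:
        return median
    if out_select == FILTER_MODE:
        # Assumption: unsigned wrap-around on underflow.
        return (data_in - median) % (1 << width)
    if out_select == TRANSPARENT_MODE:
        return data_in
    raise ValueError("out_select value not covered by this sketch")


regs = [10, 12, 11, 200, 9, 13, 12, 11]   # one outlier among 8 channels
print(martin_output(15, regs, MEDIAN_MODE))       # 12
print(martin_output(15, regs, FILTER_MODE))       # 3
print(martin_output(15, regs, TRANSPARENT_MODE))  # 15
```

The outlier of 200 does not move the median, which is the property that makes a median-based estimate of the common-mode noise robust to a real signal on a few channels.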

The development team aims to test this chip after production with basic tests to ensure the correct functionality of the algorithm, tests with real experiment data, and radiation tests to show that this design is fit to be implemented in the SALSA chip.

#4. RTL to GDS flow

The RTL to GDS flow used here consists of the following steps:
1. Convert System Verilog to Verilog (sv2v)
2. Synthesis (Yosys/OpenROAD-flow-scripts)
3. Floorplan (OpenROAD/OpenROAD-flow-scripts)
4. Placement (OpenROAD/OpenROAD-flow-scripts)
5. CTS (OpenROAD/OpenROAD-flow-scripts)
6. Routing (OpenROAD/OpenROAD-flow-scripts)
7. Sealring generation (KLayout)
8. GDS Merging (KLayout/OpenROAD-flow-scripts)
9. Metal filling (KLayout)

All of the steps are done utilizing open source software and the process described below shows how each one of them should be used to generate the MARTIn chip from the source code.

The flow uses the [IHP 130nm BiCMOS Open Source PDK](https://github.com/IHP-GmbH/IHP-Open-PDK) and is based on the flow used by the [i2c-gpio-expander](https://github.com/aesc-silicon/i2c-gpio-expander/tree/main) project on GitHub.

##4.1) Environment setup

The environment can be prepared using the following scripts, which install the dependencies and tools required by the flow. This environment is the same as the one used in the i2c-gpio-expander project, with some alterations to use a more recent version of the IHP PDK.

In [None]:
!sudo apt-get update && sudo apt-get install -y \
    sudo \
    apt-utils \
    software-properties-common \
    ssh \
    git \
    curl \
    time \
    libtool-bin \
    autotools-dev \
    automake \
    pkg-config \
    libyaml-dev \
    libssl-dev \
    gdb \
    ninja-build \
    flex \
    bison \
    libfl-dev \
    cmake \
    libftdi1-dev \
    python3.10 \
    python3.10-dev \
    python3-pip \
    libpython3.10 \
    virtualenv \
    openjdk-11-jdk-headless \
    verilator \
    gtkwave \
    libcanberra-gtk-module \
    libcanberra-gtk3-module \
    libtinfo5 \
    libncurses5 \
    klayout

#Yosys
!wget https://github.com/YosysHQ/oss-cad-suite-build/releases/download/2024-07-17/oss-cad-suite-linux-x64-20240717.tgz && \
    tar -xvf oss-cad-suite-linux-x64-20240717.tgz && \
    rm oss-cad-suite-linux-x64-20240717.tgz

#Install Openroad
!wget https://github.com/Precision-Innovations/OpenROAD/releases/download/2024-08-05/openroad_2.0_amd64-ubuntu20.04-2024-08-05.deb && \
    sudo apt install -y ./openroad_2.0_amd64-ubuntu20.04-2024-08-05.deb && \
    rm openroad_2.0_amd64-ubuntu20.04-2024-08-05.deb
#Install Klayout
!wget https://www.klayout.org/downloads/Ubuntu-22/klayout_0.29.0-1_amd64.deb && \
    sudo apt install -y ./klayout_0.29.0-1_amd64.deb && \
    rm klayout_0.29.0-1_amd64.deb
#Install OpenRoad-Flow-Scripts
!git clone https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts.git &&\
    cd OpenROAD-flow-scripts && \
    git checkout d617deb35b6823c03846bacfefbd838f49cff437

#Install IHP PDK
!git clone --progress https://github.com/IHP-GmbH/IHP-Open-PDK.git && \
    cd IHP-Open-PDK && \
    git checkout 4c6508d


Streaming output truncated to the last 5000 lines.

In [None]:
!cd /content/IHP-Open-PDK/ihp-sg13g2/libs.tech/klayout/python && git clone https://github.com/IHP-GmbH/pypreprocessor.git
!cd /content/IHP-Open-PDK/ihp-sg13g2/libs.tech/klayout/python && git clone https://github.com/IHP-GmbH/pycell4klayout-api.git

Cloning into 'pypreprocessor'...
remote: Enumerating objects: 450, done.
remote: Counting objects: 100% (14/14), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 450 (delta 4), reused 8 (delta 2), pack-reused 436 (from 1)
Receiving objects: 100% (450/450), 90.82 KiB | 543.00 KiB/s, done.
Resolving deltas: 100% (214/214), done.
Cloning into 'pycell4klayout-api'...
remote: Enumerating objects: 59, done.
remote: Counting objects: 100% (59/59), done.
remote: Compressing objects: 100% (30/30), done.
remote: Total 59 (delta 25), reused 53 (delta 22), pack-reused 0 (from 0)
Receiving objects: 100% (59/59), 404.97 KiB | 10.38 MiB/s, done.
Resolving deltas: 100% (25/25), done.
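
Before moving on, it can be useful to confirm that the tools installed above are actually reachable. The snippet below is a small optional check, not part of the original flow: the helper name `missing_tools` is hypothetical, and the entries in `REQUIRED` assume the install locations used in this notebook (`openroad` and `klayout` on the `PATH`, Yosys under `/content/oss-cad-suite/bin`).

```python
import shutil


def missing_tools(tools):
    """Return the entries of `tools` that cannot be resolved to an executable."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Assumed locations, matching the installation cells above.
REQUIRED = ["openroad", "klayout", "/content/oss-cad-suite/bin/yosys"]

missing = missing_tools(REQUIRED)
print("missing tools:", missing if missing else "none")
```

`shutil.which` accepts both bare command names, which are looked up on the `PATH`, and explicit paths, which are checked directly.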


##4.2) Getting source files from GitHub

The Verilog, System Verilog and config files for the OpenROAD flow scripts can be found in the GitHub repository for [MARTIn](https://github.com/nickguimara/MARTIn_ISSCS.git)

In [None]:
!git clone https://github.com/nickguimara/MARTIn_ISSCS.git
!mv ./MARTIn_ISSCS ./MARTIn

Cloning into 'MARTIn_ISSCS'...
remote: Enumerating objects: 55, done.
remote: Counting objects: 100% (55/55), done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 55 (delta 16), reused 48 (delta 12), pack-reused 0 (from 0)
Receiving objects: 100% (55/55), 17.27 KiB | 3.45 MiB/s, done.
Resolving deltas: 100% (16/16), done.


##4.3) Converting System Verilog source files to Verilog

As the synthesis tool (Yosys) used by the OpenROAD flow does not support System Verilog design files, they need to be converted to Verilog files. To accomplish that, the [sv2v](https://github.com/zachjs/sv2v.git) tool is used.

In [None]:
!curl -sSL https://get.haskellstack.org/ | sh
!git clone https://github.com/zachjs/sv2v.git
!cd sv2v; make
!./sv2v/bin/sv2v ./MARTIn/src/*.sv > ./MARTIn/src/filter.v

Detected Linux distribution: ubuntu

Installing dependencies...


About to use 'sudo' to run the following command as root:
    apt-get install -y libffi-dev libgmp-dev netbase
in order to install required system dependencies.

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libbz2-dev libpkgconf3 libreadline-dev
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
  libgmpxx4ldbl
Suggested packages:
  gmp-doc libgmp10-doc libmpfr-dev
The following NEW packages will be installed:
  libffi-dev libgmp-dev libgmpxx4ldbl netbase
0 upgraded, 4 newly installed, 0 to remove and 52 not upgraded.
Need to get 423 kB of archives.
After this operation, 2,080 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy/main amd64 netbase all 6.3 [12.9 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy/mai

## 4.4) Generating the sealring

A sealring is required for the chip to be produced. The script to generate it can be found in the IHP PDK.

Here a 1370 um x 1370 um sealring is generated to encapsulate the chip.


In [None]:
width = 1370
height = 1370
!export KLAYOUT_HOME=/content/IHP-Open-PDK/ihp-sg13g2/libs.tech/klayout && klayout -n sg13g2 -zz -r /content/IHP-Open-PDK/ihp-sg13g2/libs.tech/klayout/tech/scripts/sealring.py \
		-rd width={width} -rd height={height} \
		-rd output=/content/MARTIn/cfg/macros/sealring.gds


![](https://drive.google.com/uc?export=view&id=1jvNdJeba3x2h1PaQ8CYC_5oKvSyYgKld)

Fig. 9 Sealring GDS

## 4.5) OpenROAD flow

Using the converted files, the sealring GDS and the config files, the OpenROAD flow can be run.


In [None]:
!cd OpenROAD-flow-scripts/flow ; source ./../env.sh  && make DESIGN_CONFIG=./../../MARTIn/cfg/config.mk YOSYS_CMD=/content/oss-cad-suite/bin/yosys OPENROAD_EXE=/usr/bin/openroad

OPENROAD: /content/OpenROAD-flow-scripts/tools/OpenROAD
/content/OpenROAD-flow-scripts/flow/util/preprocessLib.py -i /content/OpenROAD-flow-scripts/flow/platforms/ihp-sg13g2/lib/sg13g2_stdcell_typ_1p20V_25C.lib -o objects/ihp-sg13g2/martin/base/lib/sg13g2_stdcell_typ_1p20V_25C.lib
Opening file for replace: /content/OpenROAD-flow-scripts/flow/platforms/ihp-sg13g2/lib/sg13g2_stdcell_typ_1p20V_25C.lib
Commented 0 lines containing "original_pin"
Replaced malformed functions 0
Writing replaced file: objects/ihp-sg13g2/martin/base/lib/sg13g2_stdcell_typ_1p20V_25C.lib
/content/OpenROAD-flow-scripts/flow/util/preprocessLib.py -i /content/OpenROAD-flow-scripts/flow/platforms/ihp-sg13g2/lib/sg13g2_io_dummy.lib -o objects/ihp-sg13g2/martin/base/lib/sg13g2_io_dummy.lib
Opening file for replace: /content/OpenROAD-flow-scripts/flow/platforms/ihp-sg13g2/lib/sg13g2_io_dummy.lib
Commented 0 lines containing "original_pin"
Replaced malformed functions 0
Writing replaced file: objects/ihp-sg13g2/martin/b

![](https://drive.google.com/uc?export=view&id=1qwt8LPm97phq8q_SQfM1-E4umKp7iUJv)

Fig. 10 OpenROAD final GDS

## 4.6) Filler scripts

To avoid violating the DRC rules required for IHP production, mainly those related to metal and Activ densities, a filling script is required. It can be found in the IHP PDK.

In [None]:
!export KLAYOUT_HOME=/content/IHP-Open-PDK/ihp-sg13g2/libs.tech/klayout && \
 export PDK_ROOT=/content/IHP-Open-PDK && \
 export PDK=ihp-sg13g2 && \
  klayout  -n sg13g2 -zz -r /content/IHP-Open-PDK/ihp-sg13g2/libs.tech/klayout/tech/scripts/filler.py \
		-rd output_file=/content/OpenROAD-flow-scripts/flow/results/ihp-sg13g2/martin/base/6_final.gds \
		/content/OpenROAD-flow-scripts/flow/results/ihp-sg13g2/martin/base/6_final.gds

Start filling ActGatP
Start filling Metal
Start filling TopMetal


#5. Verification



## 5.1) Post flow simulation

For the post flow simulation, there are two testbenches written in Verilog. They can be run using the [Icarus Verilog](https://github.com/steveicarus/iverilog) simulation tool.

The design is simulated using the output netlist generated by the OpenROAD flow, together with the standard cell and IO cell libraries provided by the IHP PDK.

It is important to note that the Icarus Verilog simulation tool does not support timing simulation and therefore does not support the *specify* blocks declared in the IHP cell library. So, in order to simulate the design, some alterations are required.

In [None]:
!sudo apt-get update && sudo apt-get install -y iverilog

Hit:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Hit:3 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:5 https://r2u.stat.illinois.edu/ubuntu jammy InRelease
Hit:6 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:7 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Hit:8 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Hit:9 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Hit:10 https://ppa.launchpadcontent.net/ubuntugis/ppa/ubuntu jammy InRelease
Reading package lists... Done
W: Skipping acquire of configured file 'main/source/Sources' as repository 'https://r2u.stat.illinois.edu/ubuntu jammy InRelease' does not seem to provide it (sources.list entry misspelt?)
Reading package lists... Done
Building depe

Here, the standard cell library is edited to remove the *specify* blocks in critical components used by the design and replace them with a functional description.

In the case of the MARTIn chip, the only alteration necessary was in the D flip-flop.

In [None]:
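import re

# Original IHP standard cell library and the patched copy used for simulation
to_change = "/content/IHP-Open-PDK/ihp-sg13g2/libs.ref/sg13g2_stdcell/verilog/sg13g2_stdcell.v"
changed_file = "/content/MARTIn/tb/stdcell.v"

# Functional body for the D flip-flop with active-low asynchronous reset.
# It replaces the original module body, which relies on specify blocks.
substitution = """
   output reg Q, Q_N;
   input D, RESET_B, CLK;
   reg notifier;
   wire delayed_D, delayed_RESET_B, delayed_CLK;

   // Function
   wire int_fwire_IQ, int_fwire_IQN, int_fwire_r;
   wire xcr_0;


   //Functional description
   always @(posedge CLK or negedge RESET_B) begin
      if(RESET_B == 1'b0) begin
         Q <= 1'b0;
         Q_N <= 1'b1;
      end else begin
         Q <= D;
         Q_N <= ~D;
      end
   end
"""

with open(to_change, "r") as fl:
    text = fl.read()

# Match the whole sg13g2_dfrbp_1 module: header, body and endmodule
pattern = r'(module sg13g2_dfrbp_1 \(Q, Q_N, D, RESET_B, CLK\);\n)(.*?)(endmodule)'
# A function replacement keeps the new body from being parsed for group escapes
changed, count = re.subn(pattern, lambda m: m.group(1) + substitution + m.group(3), text, flags=re.DOTALL)
if count != 1:
    raise RuntimeError(f"expected to patch exactly one sg13g2_dfrbp_1 module, patched {count}")

with open(changed_file, "w") as fl:
    fl.write(changed)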
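
For other designs, more cells may need the same treatment. As a starting point, the sketch below lists which modules of a Verilog library contain a *specify* block. It is a rough text scan rather than a Verilog parser: the helper name `modules_with_specify` and the two sample cells are hypothetical, and the word `specify` inside a comment would also be reported.

```python
import re


def modules_with_specify(verilog_text):
    """Return the names of the modules whose body contains a specify block."""
    names = []
    module_re = r"\bmodule\s+(\w+)(.*?)\bendmodule\b"
    for match in re.finditer(module_re, verilog_text, flags=re.DOTALL):
        name, body = match.groups()
        # \b keeps "endspecify" from matching on its own.
        if re.search(r"\bspecify\b", body):
            names.append(name)
    return names


# Hypothetical two-cell library used only to exercise the helper.
sample = """
module cell_a (Y, A);
   output Y; input A;
   not (Y, A);
   specify
      (A => Y) = 0;
   endspecify
endmodule

module cell_b (Y, A);
   output Y; input A;
   buf (Y, A);
endmodule
"""

print(modules_with_specify(sample))  # ['cell_a']
```

Running the helper on the contents of `sg13g2_stdcell.v` and intersecting the result with the cells that appear in the final netlist would show which modules are candidates for a functional replacement.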



Now it is possible to run the testbench that checks the shift register functionality.



In [None]:
!iverilog /content/MARTIn/tb/shreg_tb.v /content/IHP-Open-PDK/ihp-sg13g2/libs.ref/sg13g2_io/verilog/sg13g2_io.v /content/MARTIn/tb/stdcell.v /content/OpenROAD-flow-scripts/flow/results/ihp-sg13g2/martin/base/6_final.v -o /content/MARTIn/tb/shreg_tb.vvp
!vvp /content/MARTIn/tb/shreg_tb.vvp

VCD info: dumpfile shreg.vcd opened for output.
Test started!
Test finished!


The second testbench checks the correctness of the three implemented modes (filter, transparency, median).

In [None]:
!iverilog /content/MARTIn/tb/filter_tb.v /content/IHP-Open-PDK/ihp-sg13g2/libs.ref/sg13g2_io/verilog/sg13g2_io.v /content/MARTIn/tb/stdcell.v /content/OpenROAD-flow-scripts/flow/results/ihp-sg13g2/martin/base/6_final.v -o /content/MARTIn/tb/filter_tb.vvp
!vvp /content/MARTIn/tb/filter_tb.vvp

VCD info: dumpfile filter.vcd opened for output.
1.Test - transparency test
2.Test - median test
3.Test - filter test


## 5.2) DRC

Lastly, to check that the design is ready for production, the minimal DRC deck provided by the IHP PDK is run.

The output shows that the design has no DRC errors.

In [None]:
!export KLAYOUT_HOME=/content/IHP-Open-PDK/ihp-sg13g2/libs.tech/klayout && klayout -n sg13g2 -b -r /content/IHP-Open-PDK/ihp-sg13g2/libs.tech/klayout/tech/drc/sg13g2_minimal.lydrc -rd cell=martin_top /content/OpenROAD-flow-scripts/flow/results/ihp-sg13g2/martin/base/6_final.gds

DRC: batch mode
Active cell: martin_top
Rule Act.a: 0 error(s)
Rule Act.b: 0 error(s)
Rule AFil.g: 0 error(s)
Rule AFil.g1: 0 error(s)
Rule AFil.g2: 0 error(s)
Rule AFil.g3: 0 error(s)
Rule TGO.f: 0 error(s)
Rule Gat.a: 0 error(s)
Rule Gat.b: 0 error(s)
Rule Gat.d: 0 error(s)
Rule GFil.g: 0 error(s)
Rule Cnt.a: 0 error(s)
Rule Cnt.b: 0 error(s)
Rule M1.a: 0 error(s)
Rule M1.b: 0 error(s)
Rule M1.j: 0 error(s)
Rule M1.k: 0 error(s)
Rule M2.a: 0 error(s)
Rule M2.b: 0 error(s)
Rule M2.j: 0 error(s)
Rule M2.k: 0 error(s)
Rule M3.a: 0 error(s)
Rule M3.b: 0 error(s)
Rule M3.j: 0 error(s)
Rule M3.k: 0 error(s)
Rule M4.a: 0 error(s)
Rule M4.b: 0 error(s)
Rule M4.j: 0 error(s)
Rule M4.k: 0 error(s)
Rule M5.a: 0 error(s)
Rule M5.b: 0 error(s)
Rule M5.j: 0 error(s)
Rule M5.k: 0 error(s)
Rule M1Fil.h: 0 error(s)
Rule M1Fil.k: 0 error(s)
Rule M2Fil.h: 0 error(s)
Rule M2Fil.k: 0 error(s)
Rule M3Fil.h: 0 error(s)
Rule M3Fil.k: 0 error(s)
Rule M4Fil.h: 0 error(s)
Rule M4Fil.k: 0 erro
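
Reading a long rule list by eye is error-prone, so the per-rule counts can be checked programmatically. The sketch below assumes the log has been captured as text in the `Rule <name>: <n> error(s)` format shown above. The `failing_rules` helper is a hypothetical name, and the last line of `sample_log` is a made-up violation added only to show what a failure would look like.

```python
import re

# Matches lines such as "Rule M1.a: 0 error(s)", even with a color-code prefix.
RULE_LINE = re.compile(r"Rule (\S+): (\d+) error\(s\)")

def failing_rules(log_text):
    """Map each rule with a nonzero error count to that count."""
    failures = {}
    for line in log_text.splitlines():
        match = RULE_LINE.search(line)
        if match and int(match.group(2)) > 0:
            failures[match.group(1)] = int(match.group(2))
    return failures

# Three lines copied from the log above, plus one hypothetical failing line.
sample_log = """Rule Act.a: 0 error(s)
Rule M1.a: 0 error(s)
Rule M5.k: 0 error(s)
Rule M2.b: 3 error(s)
"""
print(failing_rules(sample_log))
```

An empty dictionary means every reported rule passed.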

# 6. Results

This section summarizes the main results of the flow.

| Metric | Value |
|:--------------------:|:----------------------:|
| **Total chip area** | 1.37 mm x 1.37 mm ≈ 1.88 mm<sup>2</sup> |
| **Clock frequency** | 50 MHz |
| **Number of flip-flops** | 1703 |
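
As a quick check of the figures in the table, the arithmetic can be reproduced directly. The sketch below assumes a square die with the rounded 1.37 mm side, which agrees with the reported area to two decimal places, and converts the clock frequency into the corresponding clock period.

```python
side_mm = 1.37     # die side from the table (rounded)
clock_mhz = 50     # clock frequency from the table

area_mm2 = side_mm ** 2          # square die
period_ns = 1_000 / clock_mhz    # period = 1 / f, expressed in nanoseconds

print(f"area   ~ {area_mm2:.2f} mm^2")
print(f"period = {period_ns:.0f} ns")
```

At 50 MHz the clock period is 20 ns.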





<img src="https://drive.google.com/uc?export=view&id=19O_eFEAZ8KZ8oCvtZRSl4b6Ff9FTn-PQ" alt="Image" width="500" height="500">

Fig. 11 Cell placement (Generated by OpenROAD)

<img src="https://drive.google.com/uc?export=view&id=1LG-TSJfeFx1Bcg1kGn8OMhq4LAhoDABm" alt="Image" width="500" height="500">

Fig. 12 Routing (Generated by OpenROAD)

<img src="https://drive.google.com/uc?export=view&id=1F506P4WzfgXU5BObR-jrlsIM1RUfDYTq" alt="Image" width="500" height="500">

Fig. 13 Clock routes (Generated by OpenROAD)

<img src="https://drive.google.com/uc?export=view&id=1JTF72PsxJlispYoqgDBth-hSFK9Y9Oua" alt="Image" width="500" height="500">

Fig. 14 IR Drop (Generated by OpenROAD)

# 7. Visualization

To explain the algorithm implemented in this chip, a Python animation was made with the [Manim](https://github.com/3b1b/manim) package. It illustrates the inner workings of the CSMF algorithm for educational purposes.

# 8. Conclusion

This project set out to design a filter module for the SALSA chip. The work produced three approaches, two of which rely on new median-finding algorithms developed by our team.

One of those algorithms, CSMF, proved innovative and promising, so a reduced version was proposed for fabrication and testing. To ease production and testing, an open-source environment based on the IHP SG13G2 open-source PDK was chosen.

The chip, called MARTIn, was coded, synthesized, laid out, and tested using only open-source tools. The design was accepted for the IHP November 2024 production round and is now being fabricated.

In addition, the team explored a new educational tool: Python-generated animations that explain digital circuits. Using the Manim package, an animation explaining the new median-finding algorithm was produced.

In summary, this project developed a functional coherent noise removal filter built on an innovative median-finding algorithm, and a reduced version is in production using an open-source PDK. These results show that contributions to particle physics experiments, particularly in microelectronics, can be made with free, open-source tools available to any school, university, or individual.

The next steps are to test the fabricated chip to confirm that the design works and to evaluate its radiation tolerance, an important factor in particle physics experiments.

#9. References
[1] T. Francke et al., “Micropattern gaseous detectors,” in INFN Eloisatron
Project: 42nd Workshop on Innovative Detectors for Supercolliders, 4
2004, pp. 158–179.

[2] J. Alme et al., “Correction of the baseline fluctuations in the gem-based
alice tpc,” Journal of Instrumentation, vol. 18, no. 11, p. P11021, 2023.

[3] J. Adolfsson et al., “SAMPA chip: the new 32 channels ASIC for
the ALICE TPC and MCH upgrades,” Journal of Instrumentation,
vol. 12, no. 04, pp. C04 008–C04 008, apr 2017. [Online]. Available:
https://doi.org/10.1088%2F1748-0221%2F12%2F04%2Fc04008

[4] S. Barboza et al., “Sampa chip: a new asic for the alice tpc and mch
upgrades,” Journal of Instrumentation, vol. 11, no. 02, p. C02088, feb
2016.

[5] S. Barboza et al., “Sampa chip: a new asic for the alice tpc and mch
upgrades,” Journal of Instrumentation, vol. 11, no. 02, p. C02088, feb
2016. [Online]. Available: https://dx.doi.org/10.1088/1748-0221/11/02/
C02088

[6] S. Acharya et al., “Alice upgrades during the lhc long shutdown 2,”
Journal of Instrumentation, vol. 19, no. 05, p. P05062, may 2024.
[Online]. Available: https://dx.doi.org/10.1088/1748-0221/19/05/P05062

[7] B. C. de Souza Sanches, “An application specific signal processor for
gaseous detector systems in high energy physics experiment,” Doctorate
thesis, Escola Polit´ecnica, Universidade de S˜ao Paulo, 2021.

