# Semiconductor design from code with XLS and OpenLane

```
Copyright 2021 Google LLC.
SPDX-License-Identifier: Apache-2.0
```
This notebook covers:
- designing with [XLS](https://google.github.io/xls/), a high-level synthesis toolkit
- generating GDS from RTL with [OpenLane](https://github.com/The-OpenROAD-Project/OpenLane/)
- designing a chip for the open-source [SKY130](https://github.com/google/skywater-pdk/) PDK

Let's try out a hardware design flow that feels close to software development.

In [None]:
%pip install -q https://github.com/conda-incubator/condacolab/archive/28521d7c5c494dd6377bb072d97592e30c44609c.tar.gz
#@title Install the conda environment {display-mode: "form"}
#@markdown - Click the ▷ button to set up the digital design environment based on [conda-eda](https://github.com/hdl/conda-eda).

openlane_version = 'latest' #@param {type:"string"}
open_pdks_version = 'latest' #@param {type:"string"}
xls_version = 'latest' #@param {type:"string"}

if openlane_version == 'latest':
  openlane_version = ''
if open_pdks_version == 'latest':
  open_pdks_version = ''
if xls_version == 'latest':
  xls_version = ''

import os
import pathlib
import sys

!curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xj bin/micromamba
conda_prefix_path = pathlib.Path('conda-env')
site_package_path = conda_prefix_path / 'lib/python3.7/site-packages'
sys.path.append(str(site_package_path.resolve()))
CONDA_PREFIX = str(conda_prefix_path.resolve())
PATH = os.environ['PATH']
LD_LIBRARY_PATH = os.environ.get('LD_LIBRARY_PATH', '')
%env CONDA_PREFIX={CONDA_PREFIX}
%env PATH={CONDA_PREFIX}/bin:{PATH}
%env LD_LIBRARY_PATH={CONDA_PREFIX}/lib:{LD_LIBRARY_PATH}
!bin/micromamba create --yes --prefix $CONDA_PREFIX
!echo 'python ==3.7*' >> {CONDA_PREFIX}/conda-meta/pinned
!CI=0 bin/micromamba install --quiet --yes --prefix $CONDA_PREFIX \
                     --channel litex-hub \
                     --channel main \
                     openlane={openlane_version} \
                     open_pdks.sky130a={open_pdks_version} \
                     xls={xls_version}
!curl -L -O https://patch-diff.githubusercontent.com/raw/The-OpenROAD-Project/OpenLane/pull/1503.patch
!patch -p1 -d conda-env/share/openlane < 1503.patch
!curl -L -O https://github.com/google/xls/archive/refs/heads/main.tar.gz
!tar --strip-components=1 -xf main.tar.gz xls-main/xls/dslx/stdlib/ xls-main/xls/modules/
def2gds_mag = '''gds read $::env(CONDA_PREFIX)/share/pdk/sky130A/libs.ref/sky130_fd_sc_hd/gds/sky130_fd_sc_hd.gds
lef read $::env(CONDA_PREFIX)/share/pdk/sky130A/libs.ref/sky130_fd_sc_hd/techlef/sky130_fd_sc_hd__nom.tlef
lef read $::env(CONDA_PREFIX)/share/pdk/sky130A/libs.ref/sky130_fd_sc_hd/lef/sky130_fd_sc_hd.lef
def read $::env(IN_DEF)
gds write $::env(IN_DEF).gds'''
with open('def2gds.mag', 'w') as f:
  f.write(def2gds_mag)
!git clone https://github.com/mbalestrini/GDS2glTF.git
!python -m pip install -r GDS2glTF/requirements.txt
!git clone https://github.com/proppy/gds_viewer.git
import jinja2
gds_viewer = jinja2.Environment(loader=jinja2.FileSystemLoader('gds_viewer')).get_template('viewer.html')

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  Building wheel for condacolab (pyproject.toml) ... done
env: CONDA_PREFIX=/content/conda-env
env: PATH=/content/conda-env/bin:/content/conda-env/bin:/opt/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tools/node/bin:/tools/google-cloud-sdk/bin
env: LD_LIBRARY_PATH=/content/conda-env/lib:/content/conda-env/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64

                                           __
          __  ______ ___  ____ _____ ___  / /_  ____ _
         / / / / __ `__ \/ __ `/ __ `__ \/ __ \/ __ `/
        / /_/ / / / / / / /_/ / / / / / / /_/ / /_/ /
       / .___/_/ /_/ /_/\__,_/_/ /_/ /_/_.___/\__,_/
      /_/

Empty environment created at prefix: /content/conda-env
  % Total    % Received % Xferd  Average Speed   Time    Time 
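
Before moving on, it can help to confirm that the command-line tools used in the rest of this notebook ended up on `PATH`. The following is a minimal sketch using only the Python standard library. The helper name `missing_tools` is hypothetical, and the tool list is simply the set of commands that later cells invoke.

```python
import shutil

def missing_tools(names):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]

# Commands invoked by later cells in this notebook.
TOOLS = ["interpreter_main", "ir_converter_main", "opt_main",
         "codegen_main", "flow.tcl"]

print("missing:", missing_tools(TOOLS) or "none")
```

If anything is reported as missing, re-running the installation cell is the first thing to try.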

## Designing with high-level synthesis (HLS)

[XLS](https://google.github.io/xls/) provides a high-level synthesis (HLS) toolchain. With XLS, hardware can be designed in a way that is close to software development.

[DSLX](https://google.github.io/xls/dslx_reference/) is a dataflow-oriented, functional domain-specific language (DSL) for hardware design. From a high-level functional description written in DSLX, XLS can generate a concrete hardware design.

![img](https://google.github.io/xls/images/xls_stack_diagram.png)

### DSLX

The example below introduces the following DSLX features:
- basic language constructs and [syntax](https://google.github.io/xls/dslx_reference/#expressions)
- [unit tests](https://google.github.io/xls/dslx_reference/#unit-tests)
- defining [parametric functions](https://google.github.io/xls/dslx_reference/#parametric-functions)
- the [standard library](https://google.github.io/xls/dslx_std/) and module [imports](https://google.github.io/xls/dslx_reference/#imports)
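
The example below writes `float32` constants field by field (`sign`, `bexp`, `fraction`). As a cross-check for such literals, here is a small Python sketch that splits an IEEE 754 single-precision value into those three fields. The helper name `f32_fields` is hypothetical and not part of XLS; it only mirrors the field names of the DSLX `F32` struct.

```python
import struct

def f32_fields(value):
    """Split a float into (sign, biased exponent, fraction) of its float32 encoding."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31               # 1 bit
    bexp = (bits >> 23) & 0xFF      # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF      # 23 bits, implicit leading 1 not stored
    return sign, bexp, fraction

for v in (3.0, 4.0, 5.0, 0.0001):
    sign, bexp, fraction = f32_fields(v)
    print(f"{v}: sign={sign} bexp=0b{bexp:08b} fraction=0b{fraction:023b}")
```

Comparing the printed bit patterns with the literals in the DSLX source is a quick way to catch a constant whose bits do not match its comment.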


In [None]:
%%bash -c 'cat > threeBodySymp.x; interpreter_main threeBodySymp.x --alsologtostderr'


import std
import float32
type F32 = float32::F32;
import xls.modules.fp.apfloat_add_2
import xls.modules.fp.apfloat_sub_2
import xls.modules.fp.apfloat_mul_2

fn is_zero(x: F32) -> u1 {
  x.bexp == u8:0
}

// Division function taken from the third_party directory of the XLS GitHub repository.
pub fn fpdiv_2x32(x: F32, y: F32) -> F32 {
  // 1. Get and expand mantissas.
  let x_fraction = (x.fraction as u64) | u64:0x80_0000;
  let y_fraction = (y.fraction as u64) | u64:0x80_0000;

  // 1a. Flush subnorms to 0.
  let x_fraction = if x.bexp == u8:0 { u64:0 } else { x_fraction };
  let y_fraction = if y.bexp == u8:0 { u64:0 } else { y_fraction };

  // 2. Subtract non-biased exponents.
  //  - Remove the bias from the exponents, subtract them, then restore the bias.
  //  - Simplifies from
  //      (A - 127) - (B - 127) + 127 = exp
  //    to
  //      A - B + 127 = exp
  let exp = (x.bexp as s10) - (y.bexp as s10) + s10:0x7f;

  // 3. Shift numerator and adjust exponent.
  let exp = if x_fraction < y_fraction { exp - s10:1 } else { exp };
  let x_fraction = if x_fraction < y_fraction { x_fraction << u64:31 } else { x_fraction << u64:30 };

  // 4. Divide integer mantissas.
  let fraction = std::iterative_div(x_fraction, y_fraction) as u32;

  // 5. Account for remainder / error.
  let fraction_has_bit_in_six_lsbs = fraction[0:6] != u6:0;
  let remainder_detected = (y_fraction * fraction as u64) != x_fraction;
  let set_fraction_lsb = !fraction_has_bit_in_six_lsbs && remainder_detected;
  let fraction = if set_fraction_lsb { fraction | u32:1 } else { fraction };

  // 6. Check rounding conditions.
  // We use nearest, half to even rounding.
  // - We round down if less than 1/2 way between values.
  // - We round up if we're more than 1/2 way
  // - If halfway, then we round whichever direction makes the
  //   result even.
  let round_bits = fraction[0:7];
  let is_half_way = round_bits[-1:] & (round_bits[:-1] == u6:0);
  let greater_than_half_way = round_bits[-1:] & (round_bits[:-1] != u6:0);

  // We're done with the extra precision bits now, so shift the
  // fraction into its almost-final width, adding one extra
  // bit for potential rounding overflow.
  let fraction = (fraction >> u32:7) as u23;
  let fraction = fraction as u24;
  let do_round_up = greater_than_half_way || (is_half_way & fraction[0:1]);
  let fraction = if do_round_up { fraction + u24:1 } else { fraction };

  // Adjust the exponent if we overflowed during rounding.
  // After checking for subnormals, we don't need the sign bit anymore.
  let exp = if fraction[-1:] { exp + s10:1 } else { exp };
  let is_subnormal = exp <= s10:0;

  // We're done - except for special cases...
  let result_sign = x.sign ^ y.sign;
  let result_exp = exp as u9;
  let result_fraction = fraction as u23;

  // 7. Special cases!
  // - Subnormals: flush to 0.
  let result_exp = if is_subnormal { u9:0 } else { result_exp };
  let result_fraction = if is_subnormal { u23:0 } else { result_fraction };

  // - Overflow infinites. Exp to 255, clear fraction.
  let result_fraction = if result_exp < u9:0xff { result_fraction } else { u23:0} ;
  let result_exp = if result_exp < u9:0xff { result_exp as u8 } else { u8:0xff };

  // - If the denominator is 0 or the numerator is infinity,
  // the result is infinity.
  let divide_by_zero = is_zero(y);
  let divide_inf = float32::is_inf(x);
  let is_result_inf = divide_by_zero || divide_inf;
  let result_exp = if is_result_inf { u8:0xff } else { result_exp };
  let result_fraction = if is_result_inf { u23:0 } else { result_fraction };

  // - If the numerator is 0 or the denominator is infinity,
  // the result is 0.
  let divide_by_inf = float32::is_inf(y);
  let divide_zero = is_zero(x);
  let is_result_zero = divide_by_inf || divide_zero;
  let result_exp = if is_result_zero { u8:0 } else { result_exp };
  let result_fraction = if is_result_zero { u23:0 } else { result_fraction };

  // Preliminary result until we check for NaN output.
  let result = F32 { sign: result_sign, bexp: result_exp, fraction: result_fraction };

  // - NaNs. NaN cases have highest priority, so we handle them last.
  //  If the numerator or denominator is NaN, the result is NaN.
  //  If we divide inf / inf or 0 / 0, the result is NaN.
  let has_nan_arg = float32::is_nan(x) || float32::is_nan(y);
  let zero_divides_zero = is_zero(x) && is_zero(y);
  let inf_divides_inf = float32::is_inf(x) && float32::is_inf(y);
  let is_result_nan = has_nan_arg || zero_divides_zero || inf_divides_inf;
  let result = if is_result_nan { float32::qnan() } else { result };

  result
}

// Square-root function taken from the third_party directory of the XLS GitHub repository.
fn fpsqrt_32(x: F32) -> F32 {
  // Flush subnormal input.
  let x = float32::subnormals_to_zero(x);

  let exp = float32::unbiased_exponent(x);

  let scaled_fixed_point_x = u1:0 ++ u8:1 ++ x.fraction;
  // If odd exp, double x to make it even.
  let scaled_fixed_point_x = if (exp as u8)[0:1] { scaled_fixed_point_x << u32:1 }
                             else { scaled_fixed_point_x };
  // exp = exp / 2, exponent of square root
  let exp = exp >> u8:1;

  // Generate sqrt(x) bit by bit.
  let scaled_fixed_point_x = scaled_fixed_point_x << u32:1;

  // s is scaled version of the square root calculated down to a
  let (scaled_fixed_point_x, sqrt_in_progress, _) =
    for (idx, (scaled_fixed_point_x,
               sqrt_in_progress,
               shifting_bit_mask)):
        (u32, (u32,
               u32,
               u32))
        in range(u32:0, u32:23 + u32:2) {

    let temp = (sqrt_in_progress << u32:1) | shifting_bit_mask;

    // Would be nice to have dslx if-blocks that can desugar
    // down to something like this automatically...
    let (sqrt_in_progress, scaled_fixed_point_x) =
    if temp <= scaled_fixed_point_x {
      (sqrt_in_progress | shifting_bit_mask,
      scaled_fixed_point_x - temp)
    } else {
      (sqrt_in_progress, scaled_fixed_point_x)
    };

    let scaled_fixed_point_x = scaled_fixed_point_x << u32:1;
    let shifting_bit_mask = shifting_bit_mask >> u32:1;

    (scaled_fixed_point_x,
     sqrt_in_progress,
     shifting_bit_mask)

  } ((scaled_fixed_point_x,      // scaled_fixed_point_x
      u32:0,                     // sqrt_in_progress
      u32:1 << u32:23 + u32:1)); // shifting_bit_mask

  // Final rounding.
  let sqrt_in_progress = if scaled_fixed_point_x != u32:0 {
    sqrt_in_progress + (u31:0 ++ sqrt_in_progress[0:1])
  } else {
    sqrt_in_progress
  };
  let scaled_fixed_point_x = (sqrt_in_progress >> u32:1) +
           ((float32::bias(exp - s8:1)) as u32 << u32:23);
  let result = float32::unflatten(scaled_fixed_point_x);

  // I don't *think* it is possible to underflow / have a subnormal result
  // here. In order to have a subnormal result, x would have to be
  // subnormal with x << sqrt(x). In this case, x would have been flushed
  // to 0. x==0 is handled below as a special case.

  // Special cases.
  // sqrt(inf) -> inf, sqrt(-inf) -> NaN (handled below along
  // with other negative numbers).
  let result = if float32::is_inf(x) { x } else { result };
  // sqrt(x < 0) -> NaN
  let result = if x.sign == u1:1 { float32::qnan() } else { result };
  // sqrt(NaN) -> NaN.
  let result = if float32::is_nan(x) { float32::qnan() } else { result };
  // x == -0 returns x rather than NaN.
  let result = if float32::is_zero_or_subnormal(x) { x } else { result };
  result
}

// Negates a floating-point value by flipping its sign bit.
fn invertSignFp(a: F32) -> F32 {
  let b: F32 = F32 {sign: !a.sign[0:1],
            bexp: a.bexp[0:8],
            fraction: a.fraction[0:23]};
  b
}

// Computes a single q/p_uw[t+1] pair.
fn updateQP (ma: F32, mb: F32, mc: F32, dt: F32, aqx: F32, apx: F32, aqSubBqX: F32, cqSubAqX: F32, rab3: F32, rca3: F32) -> F32[2] {
  let mbAqSubBqX: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(mb, aqSubBqX); // mb * (aq.x - bq.x)
  let fabSlaMaX: F32 = fpdiv_2x32(mbAqSubBqX, rab3); // mb * (aq.x - bq.x) / rab^3
  let mcCqSubAqX: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(mc, cqSubAqX); // mc * (cq.x - aq.x)
  let facSlaMaX: F32 = fpdiv_2x32(mcCqSubAqX, rca3); // mc * (cq.x - aq.x) / rca^3
  let faSlaMaX: F32 = apfloat_sub_2::apfloat_sub_2<u32:8, u32:23>(facSlaMaX, fabSlaMaX); // -mb(aq.x-bq.x)/rab^3 -mc(aq.x-cq.x)/rca^3
  let apxDelta: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(faSlaMaX, dt);
  let apxNew: F32 = apfloat_add_2::add<u32:8, u32:23>(apx, apxDelta); // ap.x[i+1] = ap.x[i] + Fa/ma*dt
  let aqxDelta: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(apxNew, dt);
  let aqxNew: F32 = apfloat_add_2::add<u32:8, u32:23>(aqx, aqxDelta); // aq.x[i+1] = aq.x[i] + ap.x[i+1]*dt

  let newQP: F32[2] = [aqxNew, apxNew];
  newQP
}

// Computes every q/p_uw[t+1] for the next step from the arguments. Uncomment lines as needed before running.
fn threeBodySymp (aqx: F32, aqy: F32, apx: F32, apy: F32, bqx: F32, bqy: F32, bpx: F32, bpy: F32, cqx: F32, cqy: F32, cpx: F32, cpy: F32) -> F32[2][6] {
  
  let ma: F32 = F32 {sign: u1: 0,
            bexp: u8:0b10000000,
            fraction: u23:0b10000000000000000000000}; // ma = 3.0
  let mb: F32 = F32 {sign: u1: 0,
            bexp: u8:0b10000001,
            fraction: u23:0b00000000000000000000000}; // mb = 4.0
  let mc: F32 = F32 {sign: u1: 0,
            bexp: u8:0b10000001,
            fraction: u23:0b01000000000000000000000}; // mc = 5.0
  let dt: F32 = F32 {sign: u1:0, 
            bexp: u8:0b01110001,
            fraction: u23:0b10100011011011100010111}; // dt = 0.0001

  // rab
  //let aqSubBqX: F32 = apfloat_sub_2::apfloat_sub_2<u32:8, u32:23>(aqx, bqx); // aq.x - bq.x
  //let aqSubBqX2: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(aqSubBqX, aqSubBqX); // (aq.x - bq.x)^2
  //let aqSubBqY: F32 = apfloat_sub_2::apfloat_sub_2<u32:8, u32:23>(aqy, bqy); // aq.y - bq.y
  //let aqSubBqY2: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(aqSubBqY, aqSubBqY); // (aq.y - bq.y)^2
  //let rab2: F32 = apfloat_add_2::add<u32:8, u32:23>(aqSubBqX2, aqSubBqY2); // rab^2 = (aq.x - bq.x)^2 + (aq.y - bq.y)^2
  //let rab: F32 = fpsqrt_32(rab2); // rab = sqrt(rab^2)
  //let rab3: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(rab, rab2); // rab^3 = rab * rab^2

  // rbc
  //let bqSubCqX: F32 = apfloat_sub_2::apfloat_sub_2<u32:8, u32:23>(bqx, cqx); // bq.x - cq.x
  //let bqSubCqX2: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(bqSubCqX, bqSubCqX); // (bq.x - cq.x)^2
  //let bqSubCqY: F32 = apfloat_sub_2::apfloat_sub_2<u32:8, u32:23>(bqy, cqy); // bq.y - cq.y
  //let bqSubCqY2: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(bqSubCqY, bqSubCqY); // (bq.y - cq.y)^2
  //let rbc2: F32 = apfloat_add_2::add<u32:8, u32:23>(bqSubCqX2, bqSubCqY2); // rbc^2 = (bq.x - cq.x)^2 + (bq.y - cq.y)^2
  //let rbc: F32 = fpsqrt_32(rbc2); // rbc = sqrt(rbc^2)
  //let rbc3: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(rbc, rbc2); // rbc^3 = rbc * rbc^2

  // rca
  //let cqSubAqX: F32 = apfloat_sub_2::apfloat_sub_2<u32:8, u32:23>(cqx, aqx); // cq.x - aq.x
  //let cqSubAqX2: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(cqSubAqX, cqSubAqX); // (cq.x - aq.x)^2
  //let cqSubAqY: F32 = apfloat_sub_2::apfloat_sub_2<u32:8, u32:23>(cqy, aqy); // cq.y - aq.y
  //let cqSubAqY2: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(cqSubAqY, cqSubAqY); // (cq.y - aq.y)^2
  //let rca2: F32 = apfloat_add_2::add<u32:8, u32:23>(cqSubAqX2, cqSubAqY2); // rca^2 = (cq.x - aq.x)^2 + (cq.y - aq.y)^2
  //let rca: F32 = fpsqrt_32(rca2); // rca = sqrt(rca^2)
  //let rca3: F32 = apfloat_mul_2::apfloat_mul_2<u32:8, u32:23>(rca, rca2); // rca^3 = rca * rca^2

  let newAQPW: F32[2] = updateQP(ma, mb, mc, dt, aqx, apx, bqx, bpx, cqx, cpx);
  //let newAQPX: F32[2] = updateQP(ma, mb, mc, dt, aqx, apx, aqSubBqX, cqSubAqX, rab3, rca3);
  //let newAQPY: F32[2] = updateQP(ma, mb, mc, dt, aqy, apy, aqSubBqY, cqSubAqY, rab3, rca3);
  //let newBQPX: F32[2] = updateQP(ma, mb, mc, dt, bqx, bpx, bqSubCqX, aqSubBqX, rbc3, rab3);
  //let newBQPY: F32[2] = updateQP(ma, mb, mc, dt, bqy, bpy, bqSubCqY, aqSubBqY, rbc3, rab3);
  //let newCQPX: F32[2] = updateQP(ma, mb, mc, dt, cqx, cpx, cqSubAqX, bqSubCqX, rca3, rbc3);
  //let newCQPY: F32[2] = updateQP(ma, mb, mc, dt, cqy, cpy, cqSubAqY, bqSubCqY, rca3, rbc3);

  // Normally, the updated positions and velocities of each body would be returned:
  //let newQP: F32[2][6] = [newAQPX, newAQPY, newBQPX, newBQPY, newCQPX, newCQPY];
  let A: F32[2] = newAQPW;
  let newQP: F32[2][6] = [A, A, A, A, A, A];

  // Print intermediate results with trace_fmt! for debugging.
  //let sign: u1 = rab.sign[0:1];
  //let exp: u8 = rab.bexp[0:8];
  //let frac: u23 = rab.fraction[0:23];
  //let print = trace_fmt!("aqSubBqX sign: {}, exp: {}, frac: {}", sign, exp, frac);

  newQP
}


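fn user_module(io_in: u8) -> u8 {
  let five = F32 {sign: u1:0, bexp:
                  float32::bias(s8:2),
                  fraction: u2:1 ++ u21:0};
  // +0.0, used for every argument except the first.
  let zero = float32::zero(u1:0);
  // Run the three-body step function; the arguments are test values.
  let result: F32[2][6] = threeBodySymp(five, zero, zero, zero, zero, zero, zero, zero, zero, zero, zero, zero);
  // Return the biased exponent of the first result as the 8-bit output.
  result[0][0].bexp
}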

#[test]
fn test() {
  let _ = user_module(u8:0b0001_0001);
}

[ RUN UNITTEST  ] test
libunwind: __unw_add_dynamic_fde: bad fde: FDE is really a CIE
libunwind: __unw_add_dynamic_fde: bad fde: FDE is really a CIE
libunwind: __unw_add_dynamic_fde: bad fde: FDE is really a CIE
W0103 03:48:23.611107   23875 run_routines.cc:122] Could not find __itok__threeBodySymp__test function for JIT comparison
[            OK ]
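
For checking the hardware result against software, the update rule in `updateQP` can be mirrored in floating-point Python. This is a sketch under assumptions: it treats the `p` values as velocities (force divided by mass), uses a gravitational constant of 1 as the DSLX code does, and evaluates every acceleration from the positions at the start of the step. The function names are hypothetical and the initial positions are illustrative only.

```python
import math

def acceleration(i, masses, positions):
    """Acceleration of body i due to all other bodies (G = 1), in 2D."""
    ax = ay = 0.0
    xi, yi = positions[i]
    for j, (xj, yj) in enumerate(positions):
        if j == i:
            continue
        dx, dy = xj - xi, yj - yi
        r3 = math.hypot(dx, dy) ** 3
        ax += masses[j] * dx / r3
        ay += masses[j] * dy / r3
    return ax, ay

def symplectic_step(masses, positions, velocities, dt):
    """One step: p[t+1] = p[t] + a*dt, then q[t+1] = q[t] + p[t+1]*dt."""
    new_q, new_p = [], []
    for i, ((qx, qy), (px, py)) in enumerate(zip(positions, velocities)):
        ax, ay = acceleration(i, masses, positions)
        px, py = px + ax * dt, py + ay * dt
        new_p.append((px, py))
        new_q.append((qx + px * dt, qy + py * dt))
    return new_q, new_p

# Masses 3, 4, 5 and dt = 0.0001 as in the DSLX code; positions are arbitrary.
q, p = symplectic_step([3.0, 4.0, 5.0],
                       [(1.0, 3.0), (-2.0, -1.0), (1.0, -1.0)],
                       [(0.0, 0.0)] * 3, dt=0.0001)
print(q[0], p[0])
```

Because the velocity is updated first and the new velocity is then used for the position, this matches the order of operations in `updateQP` (`apxNew` is computed before `aqxNew`).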


## Converting to the hardware IR
Convert the DSLX code to [XLS IR](https://google.github.io/xls/ir_semantics/), a form better suited to hardware synthesis.
XLS IR is a pure dataflow-oriented intermediate representation (IR) designed specifically for circuit synthesis.

In [None]:
#//!ir_converter_main --top=user_module user_module.x > user_module.ir
#//!opt_main user_module.ir > user_module_opt.ir
#//!cat user_module_opt.ir

!ir_converter_main --top=threeBodySymp threeBodySymp.x > threeBodySymp.ir
!opt_main threeBodySymp.ir > threeBodySymp_opt.ir
!cat threeBodySymp_opt.ir

package threeBodySymp

file_number 0 "fake_file.x"

top fn __threeBodySymp__threeBodySymp(aqx: (bits[1], bits[8], bits[23]), aqy: (bits[1], bits[8], bits[23]), apx: (bits[1], bits[8], bits[23]), apy: (bits[1], bits[8], bits[23]), bqx: (bits[1], bits[8], bits[23]), bqy: (bits[1], bits[8], bits[23]), bpx: (bits[1], bits[8], bits[23]), bpy: (bits[1], bits[8], bits[23]), cqx: (bits[1], bits[8], bits[23]), cqy: (bits[1], bits[8], bits[23]), cpx: (bits[1], bits[8], bits[23]), cpy: (bits[1], bits[8], bits[23])) -> (bits[1], bits[8], bits[23])[2][6] {
  literal.45478: bits[1] = literal(value=0, id=45478, pos=[(0,207,52)])
  bpx_fraction__1: bits[23] = tuple_index(bpx, index=2, id=10996, pos=[(0,62,21)])
  bpx_bexp__5: bits[8] = tuple_index(bpx, index=1, id=10985, pos=[(0,30,3)])
  literal.45468: bits[8] = literal(value=0, id=45468, pos=[(0,30,23)])
  literal.10993: bits[1] = literal(value=0, id=10993, pos=[(0,207,52)])
  bqx_fraction__1: bits[23] = tuple_index(bqx, index=2, id=10975, pos=[(0,6
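
The optimized IR can be long, so a quick summary of which operations it contains is often more useful than reading it line by line. The sketch below counts operations in IR text, assuming each node is printed as `name: type = op(...)` as in the output above. The function name `count_ir_ops` is hypothetical, and the regular expression is a heuristic rather than an official parser.

```python
import collections
import re

# Matches "  name: type = op(" (optionally prefixed by "ret") and captures the op.
_NODE_RE = re.compile(r"^\s*(?:ret\s+)?[\w.]+:\s*.+?\s=\s(\w+)\(")

def count_ir_ops(ir_text):
    """Return a Counter mapping op name to number of nodes using it."""
    counts = collections.Counter()
    for line in ir_text.splitlines():
        match = _NODE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

sample = """\
  literal.45478: bits[1] = literal(value=0, id=45478, pos=[(0,207,52)])
  bpx_fraction__1: bits[23] = tuple_index(bpx, index=2, id=10996, pos=[(0,62,21)])
  bpx_bexp__5: bits[8] = tuple_index(bpx, index=1, id=10985, pos=[(0,30,3)])
"""
print(count_ir_ops(sample))
```

To summarize the real file, pass `open('threeBodySymp_opt.ir').read()` instead of `sample`.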

## Generating RTL

XLS codegen generates (System)Verilog [RTL](https://en.wikipedia.org/wiki/Register-transfer_level)
from XLS IR for use in synthesis and simulation.

Because [Verilog](https://en.wikipedia.org/wiki/Verilog) is widely used in circuit design, the Verilog that XLS generates can be integrated into a variety of design flows and combined with other designs.

In [None]:
!codegen_main --use_system_verilog=false --module_name=threeBodySymp --generator=combinational threeBodySymp_opt.ir > threeBodySymp.v
!cat threeBodySymp.v

module threeBodySymp(
  input wire [31:0] aqx,
  input wire [31:0] aqy,
  input wire [31:0] apx,
  input wire [31:0] apy,
  input wire [31:0] bqx,
  input wire [31:0] bqy,
  input wire [31:0] bpx,
  input wire [31:0] bpy,
  input wire [31:0] cqx,
  input wire [31:0] cqy,
  input wire [31:0] cpx,
  input wire [31:0] cpy,
  output wire [383:0] out
);
  // lint_off MULTIPLY
  function automatic [47:0] umul48b_24b_x_24b (input reg [23:0] lhs, input reg [23:0] rhs);
    begin
      umul48b_24b_x_24b = lhs * rhs;
    end
  endfunction
  // lint_on MULTIPLY
  // lint_off MULTIPLY
  function automatic [55:0] umul56b_24b_x_32b (input reg [23:0] lhs, input reg [31:0] rhs);
    begin
      umul56b_24b_x_32b = lhs * rhs;
    end
  endfunction
  // lint_on MULTIPLY
  wire [22:0] bpx_fraction__1;
  wire [7:0] bpx_bexp__5;
  wire [22:0] bqx_fraction__1;
  wire [7:0] bqx_bexp__5;
  wire ne_49633;
  wire ne_49637;
  wire [23:0] bpx_fraction__2;
  wire [23:0] bqx_fraction__2;
  wire [8:0] add_49647;
  w
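
In the generated module, each `F32` argument (1 sign bit, 8 exponent bits, 23 fraction bits) becomes a flat 32-bit input, and the `F32[2][6]` result becomes a single 384-bit output. The sketch below tallies port widths from a Verilog header to confirm this. It assumes ports are declared as `input wire [msb:lsb] name`, as in the output above, and `port_widths` is a hypothetical helper that is not part of XLS.

```python
import re

_PORT_RE = re.compile(r"\b(input|output)\s+wire\s+(?:\[(\d+):(\d+)\]\s+)?(\w+)")

def port_widths(verilog_text):
    """Return {port_name: (direction, width_in_bits)} for simple wire ports."""
    ports = {}
    for direction, msb, lsb, name in _PORT_RE.findall(verilog_text):
        width = int(msb) - int(lsb) + 1 if msb else 1
        ports[name] = (direction, width)
    return ports

header = """
module threeBodySymp(
  input wire [31:0] aqx,
  input wire [31:0] aqy,
  output wire [383:0] out
);
"""
ports = port_widths(header)
print(ports)
print("output bits:", sum(w for d, w in ports.values() if d == "output"))
```

The 384 output bits correspond to 6 × 2 values of 32 bits each.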

## Running the OpenLane flow

[OpenLane](https://openlane.readthedocs.io/en/latest/)
is an automated flow that generates
[GDSII](https://en.wikipedia.org/wiki/GDSII)
from [RTL](https://en.wikipedia.org/wiki/Register-transfer_level).

The flow consists of components such as
[OpenROAD](https://theopenroadproject.org/),
[Yosys](https://yosyshq.net/yosys/), [Magic](http://www.opencircuitdesign.com/magic/), and [Netgen](http://opencircuitdesign.com/netgen/),
together with custom scripts for design exploration and optimization targeting
[open source PDKs](https://github.com/google/open-source-pdks).

See the diagram below for an overview of the flow.

![img](https://openlane.readthedocs.io/en/latest/_images/flow_v1.png)

### OpenLane configuration

[Documentation](https://openlane.readthedocs.io/en/latest/reference/configuration.html)

In [None]:
%%writefile config.json
{
    "DESIGN_NAME": "threeBodySymp",
    "VERILOG_FILES": "dir::threeBodySymp.v",
    "CLOCK_TREE_SYNTH": false,
    "CLOCK_PERIOD": 100000,
    "CLOCK_PORT": "clk",
    "CLOCK_NET": "ref::$CLOCK_PORT",
    "FP_SIZING": "absolute",
    "DIE_AREA": "0 0 50 50",
    "PL_TARGET_DENSITY": 0.30,
    "FP_PIN_ORDER_CFG": "dir::pin_order.cfg"
}

Overwriting config.json
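
Two of the settings above are easier to judge once converted into familiar quantities. The sketch below does that conversion, assuming, based on the OpenLane configuration reference, that `CLOCK_PERIOD` is given in nanoseconds and `DIE_AREA` is `x0 y0 x1 y1` in microns. `summarize_config` is a hypothetical helper, not an OpenLane API.

```python
import json

def summarize_config(config_text):
    """Convert DIE_AREA and CLOCK_PERIOD into an area and a clock frequency."""
    cfg = json.loads(config_text)
    x0, y0, x1, y1 = (float(v) for v in cfg["DIE_AREA"].split())
    return {
        "die_area_um2": (x1 - x0) * (y1 - y0),    # assumes microns
        "clock_mhz": 1e3 / cfg["CLOCK_PERIOD"],   # assumes nanoseconds
    }

example = '{"DIE_AREA": "0 0 50 50", "CLOCK_PERIOD": 100000}'
print(summarize_config(example))
```

For the file written above, `summarize_config(open('config.json').read())` gives the same kind of summary.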


In [None]:
%%writefile pin_order.cfg
#BUS_SORT

#W
[abc][pq][xy].*

#E
out.*

Overwriting pin_order.cfg
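
Each non-comment line under a side marker such as `#W` or `#E` is a pattern for pin names. As a quick way to see which ports a pattern set covers, the sketch below matches patterns against port names with Python's `re` module. It assumes the patterns behave like regular expressions anchored at the start of the pin name, which is an assumption about OpenLane's I/O placer rather than something this notebook verifies. `assign_sides` is a hypothetical helper and the patterns in the example are illustrative.

```python
import re

def assign_sides(port_names, patterns_by_side):
    """Map each port to the first side whose pattern matches; None if unmatched."""
    assignment = {}
    for port in port_names:
        assignment[port] = None
        for side, patterns in patterns_by_side.items():
            if any(re.match(p, port) for p in patterns):
                assignment[port] = side
                break
    return assignment

# Port names of the generated threeBodySymp module.
ports = ["aqx", "aqy", "apx", "apy", "bqx", "bqy", "bpx", "bpy",
         "cqx", "cqy", "cpx", "cpy", "out"]
sides = assign_sides(ports, {"W": [r"[abc][pq][xy].*"], "E": [r"out.*"]})
print("unmatched:", [name for name, side in sides.items() if side is None])
```

Any port listed as unmatched is one whose placement the configuration does not pin to a side.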


### Synthesis

- Input: [RTL](https://en.wikipedia.org/wiki/Register-transfer_level) (Verilog)
- Output: gate-level [netlist](https://en.wikipedia.org/wiki/Netlist) with cell information (Verilog)
- Metrics: cell count and an estimate of [timing closure](https://en.wikipedia.org/wiki/Timing_closure)

[Documentation](https://openlane.readthedocs.io/en/latest/usage/hardening_macros.html#synthesis)

In [None]:
%env PDK=sky130A
!flow.tcl -design . -to synthesis

env: PDK=sky130A
OpenLane 2022.11.12_3_g1298859-conda
All rights reserved. (c) 2020-2022 Efabless Corporation and contributors.
Available under the Apache License, version 2.0. See the LICENSE file for more details.

[WARNING]: OpenLane may not function properly: not enough values to unpack (expected 3, got 1)
The version of open_pdks used in building the PDK does not match the version OpenLane was tested on (installed: 3696eca015bc64afa69c385dddaae931d9da3496, tested: 0059588eebfc704681dc2368bd1d33d96281d10f)
This may introduce some issues. You may want to re-install the PDK by invoking `make pdk`.
The version of magic used in building the PDK does not match the version OpenLane was tested on (installed: be40825e9aadc1bed858801572bef0415444b516, tested: 94daf986ab9aa94a9ae2ac3539fa5def9bd2a1ac)
This may introduce some issues. You may want to re-install the PDK by invoking `make pdk`.
[INFO]: Using configuration in 'config.json'...
[INFO]: PDK Root: /content/conda-env/share/pdk
[INFO]: Process Design Kit: sky130A
[INFO]: Standard Cell Library: sky130_fd_sc_hd
[INFO]: Optimization Standard Cell Library: sky130_fd_sc_hd
[INFO]: Run Directory: /content/runs/RUN_2023.01.02_13.23.38
[INFO]: Preparing LEF files for the nom corner...
[INFO]: Preparing LEF files for the min corner...
[INFO]: Preparing LEF files for the max corner...
[STEP 1]
[INFO]: Running Synthesis (log: runs/RUN_2023.01.02_13.23.38/logs/synthesis/1-synthesis.log)...
[ERROR]: during executing yosys script /content/conda-env/share/openlane/scripts/yosys/synth.tcl
[ERROR]: Log: runs/RUN_2023.01.02_13.23.38/logs/synthesis/1-synthesis.log
[ERROR]: Last 10 lines:
      Found 1 activation_patterns using ctrl signal $auto$rtlil.cc:2375:ReduceOr$28189.
      Forbidden control signals for this pair of cells: { $or$/content/threeBodySymp.v:22925$13454_Y $or$/content/threeBodySymp.v:22922$13447_Y $ge$/content/threeBodySymp.v:22704$12998_Y $ge$/content/threeBodySymp.v:22701$12994_Y $xor$/content/threeBodySymp.v:22610$12184_Y $xor$/content/threeBodySymp.v:22609$12181_Y $xor$/content/threeBodySymp.v:22608$12178_Y $xor$/content/threeBodySymp.v:22607$12175_Y $ge$/content/threeBodySymp.v:22578$12124_Y $ge$/content/threeBodySymp.v:22577$12121_Y $ge$/content/threeBodySymp.v:22575$12115_Y $ge$/content/threeBodySymp.v:22574$12112_Y $ge$/content/threeBodySymp.v:22524$12012_Y $ge$/content/threeBodySymp.v:22523$12009_Y $ge$/content/threeBodySymp.v:22521$11998_Y $ge$/content/threeBodySymp.v:22520$11995_Y $or$/content/threeBodySymp.v:22460$11893_Y $or$/content/threeBodySymp.v:22458$11888_Y $le$/content/threeBodySymp.v:22303$11677_Y $le$/content/threeBodySymp.v:22301$11675_Y $or$/content/threeBodySymp.v:22200$11605_Y $or$/content/threeBodySymp.v:22197$11600_Y $ge$/content/threeBodySymp.v:21960$11156_Y $ge$/content/threeBodySymp.v:21957$11152_Y \is_operand_inf__20 \is_result_nan__53 \is_operand_inf__17 \is_result_nan__44 \do_round_up__62 \do_round_up__53 \fraction__260 [28] \fraction__215 [28] \ugt_212947 \ugt_212940 \is_result_nan__52 \is_result_nan__43 \fraction__259 [23] \fraction__214 [23] \do_round_up__52 \do_round_up__43 \do_round_up__51 \do_round_up__42 \fraction__251 [28] \fraction__206 [28] }
      Activation pattern for cell $shr$/content/threeBodySymp.v:21832$10279: $auto$rtlil.cc:2375:ReduceOr$28171 = 1'0
      Activation pattern for cell $shr$/content/threeBodySymp.v:21828$10268: $auto$rtlil.cc:2375:ReduceOr$28189 = 1'0
      Size of SAT problem: 0 cells, 1238743 variables, 3409619 clauses
      According to the SAT solver this pair of cells can not be shared.
      Model from SAT solver: { $auto$rtlil.cc:2375:ReduceOr$28189 $auto$rtlil.cc:2375:ReduceOr$28171 } = 2'00
    Analyzing resource sharing with $shr$/content/threeBodySymp.v:21827$10265 ($shr):
      Found 1 activation_patterns using ctrl signal $auto$rtlil.cc:2375:ReduceOr$28198.
      Forbidden control signals for this pair of cells: { $or$/content/threeBodySymp.v:22925$13454_Y $or$/content/threeBodySymp.v:22922$13447_Y $ge$/content/threeBodySymp.v:22704$12998_Y $ge$/content/threeBodySymp.v:22701$12994_Y $xor$/content/threeBodySymp.v:22610$12184_Y $xor$/content/threeBodySymp.v:22609$12181_Y $xor$/content/threeBodySymp.v:22608$12178_Y $xor$/content/threeBodySymp.v:226child killed: kill signal

[ERROR]: Creating issue reproducible...
[INFO]: Saving runtime environment...
OpenLane TCL Issue Packager

EFABLESS CORPORATION AND ALL AUTHORS OF THE OPENLANE PROJECT SHALL NOT BE HELD
LIABLE FOR ANY LEAKS THAT MAY OCCUR TO ANY PROPRIETARY DATA AS A RESULT OF USING
THIS SCRIPT. THIS SCRIPT IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND.

BY USING THIS SCRIPT, YOU ACKNOWLEDGE THAT YOU FULLY UNDERSTAND THIS DISCLAIMER
AND ALL IT ENTAILS.

Parsing config file(s)…
Setting up /content/runs/RUN_2023.01.02_13.23.38/issue_reproducible…
Done.
[INFO]: Reproducible packaged: Please tarball and upload 'runs/RUN_2023.01.02_13.23.38/issue_reproducible' if you're going to submit an issue.
[INFO]: Saving current set of views in 'runs/RUN_2023.01.02_13.23.38/results/final'...
[INFO]: Generating final set of reports...
[INFO]: Created manufacturability report at 'runs/RUN_2023.01.02_13.23.38/reports/manufacturability.rpt'.
[INFO]: Created metrics report at 'runs/RUN_2023.01.02_13.23.38/reports/metrics.csv'.
[INFO]: Saving runtime environment...
[ERROR]: Flow failed.
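
When a run fails like this, the useful lines are easy to lose in the output. The sketch below pulls the `[ERROR]` lines out of a flow log and flags the `child killed` message that appears above when the synthesis process is terminated. It is plain text filtering, not an OpenLane API, and `triage_log` is a hypothetical helper. The sample text is abridged from the log above.

```python
def triage_log(log_text):
    """Collect [ERROR] lines and note whether a child process was killed."""
    lines = log_text.splitlines()
    return {
        "errors": [ln.strip() for ln in lines if ln.lstrip().startswith("[ERROR]")],
        "killed": any("child killed" in ln for ln in lines),
    }

sample = """\
[STEP 1]
[INFO]: Running Synthesis (log: runs/RUN_2023.01.02_13.23.38/logs/synthesis/1-synthesis.log)...
[ERROR]: Last 10 lines:
      Size of SAT problem: 0 cells, 1238743 variables, 3409619 clauses
child killed: kill signal
[ERROR]: Flow failed.
"""
print(triage_log(sample))
```

A kill signal during synthesis is often a sign that the process ran out of memory, although the log alone does not confirm that here.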

In [None]:
#@title Preview the netlist {display-mode: "form"}

import graphviz
import pathlib

dots = sorted(pathlib.Path('runs').glob('*/tmp/synthesis/post_techmap.dot'))
print(dots)
dot = graphviz.Source.from_file(dots[-1])
dot.engine = 'dot'
#dot # Rendering the graph is too slow to finish. The .dot file alone is enough to count cells such as XOR.

[PosixPath('runs/RUN_2023.01.03_02.18.38/tmp/synthesis/post_techmap.dot'), PosixPath('runs/RUN_2023.01.03_03.49.01/tmp/synthesis/post_techmap.dot')]
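
As the comment in the cell notes, the `.dot` file can be used to count gates without rendering it. Here is a minimal sketch, assuming gate types appear in the file under Yosys's internal gate names such as `$_XOR_` or `$_AND_`; the exact label layout of the file is not checked here. `count_gates` is a hypothetical helper, and the sample string is made up for illustration.

```python
import collections
import re

def count_gates(dot_text):
    """Count occurrences of Yosys-style internal gate names like $_XOR_."""
    return collections.Counter(re.findall(r"\$_([A-Z0-9]+)_", dot_text))

sample = 'c1 [label="$_XOR_"]; c2 [label="$_AND_"]; c3 [label="$_XOR_"];'
print(count_gates(sample).most_common())
```

With the variables from the cell above, `count_gates(dots[-1].read_text())` would summarize the most recent run.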
