# M17 for Pynq

This project explores the direct transmission and reception of M17 radio
on VHF and UHF.  The transmitter is based on work done in 2019 on an arbitrary
clock generator (ACG), or NCO, for Pynq.

This ACG work was inspired by two things.  First, I was playing around with
an Adafruit Si5351A module, trying to make a quadrature clock for testing
a Quadrature Sampling Detector (QSD), also known as a
[Tayloe Mixer](https://wparc.us/presentations/SDR-2-19-2013/Tayloe_mixer_x3a.pdf),
and noticed that the phase and duty cycle change based on frequency.  I have
no idea if this causes a problem at the HF and low VHF frequencies of
interest (6m - 160m amateur radio bands), but it seemed less than ideal.

Second was discovering [an article](https://zipcpu.com/blog/2019/06/28/genclk.html)
by Dan Gisselquist [@ZipCPU](https://twitter.com/zipcpu) on implementing an
arbitrary clock generator for an Artix-7 board.  I had been thinking about
doing such a thing but I'm an FPGA newbie and didn't really know where to
start.  Most of what I needed to implement this for Pynq is available in Vivado
IP Integrator.  The remaining bits were trivial to implement in HLS.

The NCO uses a high-speed (for the Zynq-7020) SERDES output to generate a
very rough clock signal.  This is then fed into a clock input and cleaned
up with a PLL, providing a very stable clock.  The PLL has a wide lock range.
As long as we stay within the bounds of the PLL's input range, we can create
a very accurate and clean clock signal.

This clock is then used to drive another SERDES pin that outputs a constant
010101... signal using DDR at 4x the input frequency.  The maximum output
frequency is 900MHz; the maximum TX frequency is half that.

This means we can cover up to the top of the 70cm band.

So, let's get started.

## Project Organization

This project is organized into 4 key components.

 1. This top-level document.
 2. The HLS code to implement the fractional divider.
 3. The Vivado IP Integrator project and block design.
 4. The Pynq code, implemented as [another notebook](Pynq/acg-pynq.ipynb).

# Requirements

As a software engineer, I like to pin down the requirements of a project
before getting started so I know what I am aiming for.  Since this is an exploratory
project, the requirements are more along the lines of "see if this will work", but
let's see what I am aiming for.

I am considering creating a low-power amateur radio transceiver, starting with a
receiver and eventually progressing to a transmitter, capable of 160m - 6m
(1.8-54.0MHz), using a QSD.  This requires two clock outputs in quadrature
(90 degrees phase offset).  We would like tuning precision to be within 1Hz over
the whole range, but we can live with less precision if necessary.  We would like
phase jitter to be below 100ps and duty cycle to be 50% +/- 0.5%.


# Fractional Clock Divider

We are going to generate a 32-bit fractional clock divider.  To get the precision
I am aiming for, I need only 24 bits, but old habits die hard.  And the example I
am basing this off of used 32 bits.

The code is pretty straightforward.  Our interface takes an increment and
forever generates a clock based on this value.  To disable it, set the increment to 0.
The increment is added to an accumulator on every clock (mostly).

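The output sample rate $r$ is 800Msps.  Each output sample advances the 32-bit
accumulator by the increment, so the top bit completes one cycle every
$2^{32} / i$ samples.  To calculate the increment $i$ for the desired frequency $f$:

$i = \frac{2^{32} f}{r}$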
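
As a rough host-side sketch of this calculation: for a phase accumulator whose top bit is the output, the increment is $2^{32} f / r$.  The names `SAMPLE_RATE_HZ`, `frequency_to_increment` and `increment_to_frequency` are made up for illustration and are not part of the project.  The sketch assumes the 800Msps sample rate stated above and a 32-bit accumulator.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed output sample rate of the NCO (800 Msps, from the text above).
constexpr double SAMPLE_RATE_HZ = 800e6;
// 2^32, the modulus of the 32-bit phase accumulator.
constexpr double ACCUMULATOR_RANGE = 4294967296.0;

// Hypothetical helper: increment that makes the accumulator's top bit
// toggle at f_hz.  The top bit can only represent frequencies below r/2.
uint32_t frequency_to_increment(double f_hz)
{
    assert(f_hz >= 0.0 && f_hz < SAMPLE_RATE_HZ / 2.0);
    return static_cast<uint32_t>(
        std::llround(f_hz * ACCUMULATOR_RANGE / SAMPLE_RATE_HZ));
}

// Hypothetical helper: the frequency actually produced by an increment.
double increment_to_frequency(uint32_t inc)
{
    return inc * SAMPLE_RATE_HZ / ACCUMULATOR_RANGE;
}
```

For example, `frequency_to_increment(200e6)` gives `0x40000000`, a quarter of the accumulator range.  `increment_to_frequency(1)` is the tuning step of the raw NCO, about 0.19Hz at this sample rate.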

----


```C++
#include <ap_int.h>
#include <cstddef>  // size_t

constexpr size_t DIVIDER=32;
constexpr size_t UPSAMPLE=8;

typedef ap_uint<DIVIDER> delay_type;
typedef ap_uint<UPSAMPLE> output_type;

output_type acg_top(delay_type inc);
```

The top-level function accepts the increment and returns up-sampled data to generate
the clock via an 8-bit DDR OSERDES interface.  We generate 8 output bits every clock
cycle.  We do this by computing 8 phase values, the base accumulator plus inc \* n
for n = 1..8, then taking the high bit of each.

```C++
#include "acg_top.hpp"

output_type acg_top(delay_type inc)
{
#pragma HLS INTERFACE s_axilite port=inc
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS latency max=0

	static ap_uint<DIVIDER> accumulator[UPSAMPLE];
	ap_uint<DIVIDER> delay[UPSAMPLE];
	output_type clock_out;

	// Generate the multiples of inc using only shifts and adds:
	// delay[n] = (n + 1) * inc.
	delay[0] = inc;
	delay[1] = inc << 1;
	delay[3] = inc << 2;
	delay[7] = inc << 3;
	delay[2] = delay[1] + delay[0];
	delay[4] = delay[3] + delay[0];
	delay[5] = delay[1] + delay[3];
	delay[6] = delay[7] - delay[0];

	// Add the increments to the base accumulator.  accumulator[0] is
	// updated last so the other seven all use its previous value.
	accumulator[1] = accumulator[0] + delay[0];
	accumulator[2] = accumulator[0] + delay[1];
	accumulator[3] = accumulator[0] + delay[2];
	accumulator[4] = accumulator[0] + delay[3];
	accumulator[5] = accumulator[0] + delay[4];
	accumulator[6] = accumulator[0] + delay[5];
	accumulator[7] = accumulator[0] + delay[6];
	accumulator[0] = accumulator[0] + delay[7];

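	// Grab the top bit of each accumulator and pack it into the output
	// byte.  Order here is important for proper serialization.  Note
	// that accumulator[0] was just advanced by 8 * inc, so it now holds
	// the largest (latest) phase and maps to the top output bit.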
	clock_out[7] = accumulator[0][DIVIDER - 1];
	clock_out[6] = accumulator[7][DIVIDER - 1];
	clock_out[5] = accumulator[6][DIVIDER - 1];
	clock_out[4] = accumulator[5][DIVIDER - 1];
	clock_out[3] = accumulator[4][DIVIDER - 1];
	clock_out[2] = accumulator[3][DIVIDER - 1];
	clock_out[1] = accumulator[2][DIVIDER - 1];
	clock_out[0] = accumulator[1][DIVIDER - 1];

	return clock_out;
}
```

That's it for the fractional clock divider.  8 bits are output each clock cycle.
The frequency can be changed on the fly via the AXI4-Lite interface by writing to
the inc_V register.
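
To sanity-check the logic without Vivado, the divider can be modelled in plain C++.  This is only a sketch: `AcgModel` and `count_rising_edges` are hypothetical names, `uint32_t` stands in for `ap_uint<32>`, and the edge counter assumes bit 0 of each byte is serialized first.

```cpp
#include <cstddef>
#include <cstdint>

// Software model of the divider: one call to step() is one fabric clock.
struct AcgModel {
    uint32_t acc = 0;  // plays the role of accumulator[0]

    uint8_t step(uint32_t inc)
    {
        uint8_t out = 0;
        for (uint32_t n = 1; n <= 8; ++n) {
            // Phase of the n-th sample in this cycle; wraps modulo 2^32.
            uint32_t phase = acc + n * inc;
            // The top bit of the phase becomes output bit n - 1.
            out = static_cast<uint8_t>(out | (((phase >> 31) & 1u) << (n - 1)));
        }
        acc += 8u * inc;  // advance the base by a full cycle of samples
        return out;
    }
};

// Count rising edges in the serialized stream (bit 0 first, by assumption).
std::size_t count_rising_edges(uint32_t inc, std::size_t cycles)
{
    AcgModel model;
    unsigned prev = 0;
    std::size_t edges = 0;
    for (std::size_t c = 0; c < cycles; ++c) {
        uint8_t byte = model.step(inc);
        for (unsigned bit = 0; bit < 8; ++bit) {
            unsigned cur = (byte >> bit) & 1u;
            if (cur && !prev) ++edges;
            prev = cur;
        }
    }
    return edges;
}
```

With an increment of `0x40000000` (a quarter of the sample rate) every cycle yields the byte `0x66`, which is two output periods per cycle.  An increment of 0 produces no edges at all, matching the "set the increment to 0 to disable" behaviour described earlier.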

# IP Integrator

To start the block design, we need to set up our board constraints.  We need 5 pins:

 1. An LED to indicate the PLL is locked.
 1. The SERDES output.  In an ideal world, this would be the same as the PLL input.
 1. The PLL input.  This must be a clock capable pin.
 1. Two pins for the quadrature clock outputs.
 
Lucky for us, PMOD A on the Pynq-Z2 works perfectly for our needs here.  The PLL lock
indicator goes to one of the on-board LEDs, and the rest are on PMOD A.  These pins are
also available on the Raspberry Pi connector.

```
set_property -dict { PACKAGE_PIN R14   IOSTANDARD LVCMOS33 } [get_ports { pll_locked }]; #IO_L6N_T0_VREF_34 Sch=led[0]
set_property -dict { PACKAGE_PIN Y18   IOSTANDARD LVCMOS33 SLEW FAST} [get_ports { synth_clk[0] }]; #IO_L17P_T2_34 Sch=ja_p[1]
set_property -dict { PACKAGE_PIN U18   IOSTANDARD LVCMOS33 } [get_ports { pll_in }]; #IO_L12P_T1_MRCC_34 Sch=ja_p[3]
set_property -dict { PACKAGE_PIN Y16   IOSTANDARD LVCMOS33 SLEW FAST} [get_ports { clk_i }]; #IO_L7P_T1_34 Sch=ja_p[2]
set_property -dict { PACKAGE_PIN Y17   IOSTANDARD LVCMOS33 SLEW FAST} [get_ports { clk_q }]; #IO_L7N_T1_34 Sch=ja_n[2]

```

The block design in IP Integrator looks like this:

![acg-block-design.png](acg-block-design.png)