In this experiment, I measure and compare the power consumption of flip-flop registers in various FPGA devices. Lower numbers presumably mean better efficiency.
No attempt is made to explain the results in terms of density, LE structure, manufacturing technology, process node or anything else. These are just the numbers; make of them what you will.
Also, registers are compared to registers and nothing else: no RAM, ROM, multipliers or any other built-in primitives the devices might have.
For every device I compile the same VHDL program, which generates as many flip-flop registers as possible for that device, all connected in a chain. Every register in the chain passes its input to its output at every tick, just as registers are supposed to. The chain is fed with a half-rate pulse signal, which flips at every clock tick. Therefore, once that signal has propagated through the whole chain, every register charges at tick N and discharges at tick N+1, releasing the accumulated energy as fast as possible.
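The behaviour of such a chain can be sketched with a small software model. This is a plain Python illustration, not the VHDL program from the repository; the function names `simulate_chain` and `toggles_per_tick` are made up for this example, and the registers are assumed to start cleared.

```python
def simulate_chain(length, ticks):
    """Behavioural model of a register chain fed by a signal that flips every tick."""
    regs = [0] * length   # assumption: all registers start cleared, as after reset
    stimulus = 0
    history = []
    for _ in range(ticks):
        stimulus ^= 1                  # the input flips at every clock tick
        regs = [stimulus] + regs[:-1]  # each register takes its predecessor's output
        history.append(list(regs))
    return history


def toggles_per_tick(history):
    """Count how many registers changed state between consecutive ticks."""
    return [
        sum(a != b for a, b in zip(prev, cur))
        for prev, cur in zip(history, history[1:])
    ]


history = simulate_chain(length=10, ticks=30)
print(toggles_per_tick(history))
```

The printed counts climb while the pattern propagates and then stay at 10, the chain length: from that point on, every register flips at every tick.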
I then measure the power drawn from the USB power supply, subtract the idle power measured with the same program held in reset, and divide by the number of registers in the chain.
The results are therefore expressed in "microwatts per register" and theoretically correspond to the power dissipated by a single register flipping at every clock tick.
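The arithmetic behind the "microwatts per register" figure is simple enough to write down. The sketch below uses a hypothetical helper name and hypothetical readings; none of the input numbers are taken from the actual measurements.

```python
def microwatts_per_register(active_w, idle_w, register_count):
    """Net power per register in microwatts: (active - idle) / count, scaled from watts."""
    if register_count <= 0:
        raise ValueError("register_count must be positive")
    return (active_w - idle_w) / register_count * 1e6


# Hypothetical readings: 0.85 W running, 0.35 W in reset, 20000 registers in the chain.
result = microwatts_per_register(active_w=0.85, idle_w=0.35, register_count=20000)
print(round(result, 1))
```

Subtracting the reset-state reading is what removes the board's static consumption (regulators, LEDs, the idle FPGA itself) from the figure.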
The experiment is repeated three times with different clock frequencies, specifically 10, 25 and 100 megahertz, generated by whatever PLL machinery the device provides. A good sign is that the observed power consumption grows close to linearly with frequency, just as you would expect in the lower frequency range, where heat dissipation is not a big issue.
This is what it looks like conceptually:
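The expected linearity matches the usual first-order model of CMOS dynamic power. It is given here as a textbook approximation, not as something derived from these measurements, and the symbols are introduced only for this illustration: $\alpha$ is the switching activity, $C$ the switched capacitance per register, $V$ the core supply voltage and $f$ the clock frequency.

$$
P_{\text{dyn}} \approx \alpha \, C \, V^{2} f
$$

With the same toggle pattern, capacitance and voltage at every frequency, this model predicts that going from 10 to 100 MHz multiplies the per-register power by 10.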
This is how it looks when synthesized (for a chain of 3 registers):
The simulation source is available in this repository. Here is what the simulated waveform looks like for a chain of 10 registers:
And here is how an actual physical output may look:
The following is the list of tested devices:
- Diymore, Altera Cyclone II (EP2C5T144C8)
- Zrtech, Altera Cyclone IV (EP4CE6E22C8N)
- Qmtech, Altera Cyclone V (5CEFA2F23I7N)
- BeMicro-Max-10, Altera Max 10 (10M08DAF484C8GES)
- Trenz-Electronic MAX1000, Altera Max 10 (10M16SAU169C8G)
- Trenz-Electronic CYC1000, Intel Cyclone 10 (10CL025YU256C8G)
- Micro-Nova Mercury, Spartan 3A (XC3S200A5VQ100)
- OHO-Elektronik GODIL50, Spartan 3E (XC3S500E4VQG100C)
- Numato Labs Mimas V2, Spartan 6 (XC6SLX9CSG324)
- Digilent Cora Z7, Zynq 7000 (XC7Z007S1CLG400C)
- Digilent Cmod S7, Spartan 7 (XC7S251CSGA225C)
- Digilent Cmod A7, Artix 7 (XC7A35T1CPG236C)
- Lattice MachXO2 Pico, Lattice Mach XO2 (LCMXO2-1200ZE-1MG123I)
- Nandland Go Board, Lattice iCE40 (iCE40HX1K)
- Trenz-Electronic LXO2000, Lattice Mach XO2 (LCMXO2-4000HC-4QN84C)
- OrangeCrab, Lattice ECP5 (LFE5U-25F-8MG285C)
- Radiona ULX3S, Lattice ECP5 (LFE5U-85F-6BG381C)
- Tang Primer, Anlogic EG4 (EG4S20BG256)
- Runber, Gowin GW1N-4 (GW1N-UV4LQ144C6/I5)
- Trenz-Electronic TEC0117, Gowin GW1NR-9 (GW1NR-LV9QN88C6/I5)
- DK START, Gowin GW2A (LV18PG256C8I7)
- FireAnt, Efinix Trion T8 (T8F81)
- Trion T20, Efinix Trion T20 (T20F256)
- Trenz-Electronic SMF2000, Microsemi SmartFusion2 (M2S010-VF400)
For power measurement, this device was used (it reports itself as Ruideng AT35 v.1.7 when it boots).
Here is the final table with the collected data. The devices are grouped by manufacturer, and the different frequencies are represented by the vertical blocks.
The following is the graph built from the data. The 100 MHz column is probably the most useful; the rest is there just to demonstrate the linearity.
Finally, here is the list of devices, in ascending order of measured power consumption.
№ | Family | Model | µW/reg @ 100MHz |
---|---|---|---|
1 | ECP5 | LFE5U-25F-8MG285C | 23.0 |
2 | Microsemi SF2 | M2S010-VF400 | 23.1 |
3 | ECP5 | LFE5U-85F-6BG381C | 25.1 |
4 | Zynq 7000 | XC7Z007S1CLG400C | 26.5 |
5 | Altera Cyclone V | 5CEFA2F23I7N | 31.6 |
6 | Altera Max 10 | 10M08DAF484C8GES | 33.4 |
7 | Intel Cyclone 10 | 10CL025YU256C8G | 36.6 |
8 | Anlogic EG4 | EG4S20BG256 | 37.4 |
9 | Trion T20 | T20F256 | 37.9 |
10 | Spartan 7 | XC7S251CSGA225C | 52.3 |
11 | Spartan 3E | XC3S500E4VQG100C | 57.0 |
12 | Artix 7 | XC7A35T1CPG236C | 67.2 |
13 | Altera Max 10 | 10M16SAU169C8G | 91.6 |
14 | iCE40 | iCE40HX1K | 23.0 x 4 = 92.0 |
15 | Mach XO2 | LCMXO2-4000HC-4QN84C | 96.1 |
16 | Trion T8 | T8F81 | 109.9 |
17 | Gowin GW2A | LV18PG256C8I7 | 114.2 |
18 | Altera Cyclone IV | EP4CE6E22C8N | 121.9 |
19 | Spartan 6 | XC6SLX9CSG324 | 122.6 |
20 | Gowin GW1NR-9 | GW1NR-LV9QN88C6/I5 | 130.5 |
21 | Altera Cyclone II | EP2C5T144C8 | 166.0 |
22 | Mach XO2 | LCMXO2-1200ZE-1MG123I | 191.6 |
23 | Spartan 3A | XC3S200A5VQ100 | 195.6 |
24 | Gowin GW1N-4 | GW1N-UV4LQ144C6/I5 | (see note) 534.2 |
The Go Board is based on the iCE40HX1K FPGA chip in the VQ100 package, which doesn't have any PLLs. Therefore I could only run it at the 25 MHz that the board oscillator provides, and its 100 MHz entry in the table is the 25 MHz measurement multiplied by four.
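The Go Board row in the table is written as `23.0 x 4 = 92.0`, which reads as the 25 MHz measurement scaled by 100/25. A minimal sketch of that scaling, assuming power is strictly proportional to frequency (the helper name is hypothetical):

```python
def extrapolate_linear(value_uw, measured_mhz, target_mhz):
    """Scale a per-register figure to another clock rate, assuming power is proportional to frequency."""
    return value_uw * target_mhz / measured_mhz


# 23.0 µW/reg at 25 MHz, taken from the iCE40HX1K row of the table.
print(extrapolate_linear(23.0, measured_mhz=25, target_mhz=100))  # prints 92.0
```

The extrapolated figure is only as good as the linearity assumption, so it is not directly comparable in confidence to the entries that were actually measured at 100 MHz.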
For the hybrid chips that contain both a CPU core and an FPGA, the CPU is ignored and only the FPGA half is used.
And I absolutely cannot explain the results from the Runber board featuring the Gowin GW1N-4 chip: it goes off the chart right away, for no apparent reason. Its two Gowin siblings do just fine. Perhaps there is something wrong with this particular board or chip.