cheaper fpgas #35

Open

aep opened this issue Dec 13, 2020 · 16 comments

@aep

aep commented Dec 13, 2020

I was wondering if much lower-spec (and much cheaper) FPGAs would work.

The Lattice ECP5 has 5 Gb/s SERDES, so I guess that's not a good option.
How about Artix? The datasheet is a little confusing to me as someone who has never used Xilinx.

" 211 Gb/s Serial Bandwidth" but only "6.6 Gb/s Transceiver Speed"

I guess 6.6 Gb/s is still OK, assuming PCIe is separate.
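
(Presumably the 211 Gb/s figure is just the aggregate across all transceivers, e.g. if the largest part has 16 GTPs, 16 x 6.6 Gb/s x 2 directions ≈ 211 Gb/s, while 6.6 Gb/s is the per-lane rate?)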

@alexforencich
Member

ECP5 is too small, too slow, and doesn't have PCIe hard IP. Artix 7 could possibly be an option, though it would be limited to either 1 Gbps or require an external XAUI PHY, and I would have to port the PCIe DMA components to 7-series, which is not high on my priority list. Cyclone 10 GX is possibly a decent option, as the serdes on that part support 10G natively; however, the PCIe hard IP is limited to gen 2 x4, which is unfortunate.

TBH, Kintex UltraScale or UltraScale+ are probably the best "bang for the buck" parts (PCIe gen 3 x8 + native 10G and even 25G serdes), and they are already supported. Also, there are some used FPGA boards available that are capable of running Corundum (specifically the VCU1525/BCU1525 - yes, they are about a grand or so, but that is an insane deal for a VU9P FPGA + PCIe gen 3 x16 + dual QSFP28 + ~700 Gbps of bandwidth into DDR4).

I might be more open to porting to Stratix V to support the DE5-Net or some of the surplus Catapult boards from Azure than to something like ECP5 or Artix. Arria 10 and Stratix 10 DX are on the roadmap; once that's done, Stratix V would probably be more an issue of timing closure than anything else, as it's an older and slower part.

At any rate, Corundum is intended for datacenter networking research where additional functionality is built on top of the core host interface and evaluated in a datacenter environment. So it's really only interesting at line rates of 10G and higher, and on FPGAs large enough to include additional functionality. If someone wanted to fund development for some of these lower-end parts, then that might be an option, but until then, if it's not something that I personally need for research purposes, it's probably not going to be supported.

I should also mention that I have no interest in producing hardware at this time. So even if there is a screaming deal on a particular part, if you can't buy it on a board in a PCIe form factor that also provides SFP+ interfaces or similar, it's not an interesting option. There was a Kickstarter a while back for a PCIe form-factor board with a KU3P FPGA for like $500, which would have been insane, but then they changed gears and went mining-only with no I/O to speak of. Such a waste of perfectly good FPGAs.

@aep
Author

aep commented Dec 13, 2020

Thanks for the long response.

Corundum is intended for datacenter networking research where additional functionality is built on top of the core host interface

Yeah, that's why I'm interested, but more from a production perspective, since I actually run a datacenter company. A grand for a network card isn't competitive when you can get an entire AMD EPYC machine doing the same thing. Someone like Azure has extremely high margins, so they probably care more about scale than price efficiency, but we're a tiny shop with very different financials.

If someone wanted to fund development for some of these lower-end parts, then that might be an option

Yes, very open to that.

I should also mention that I have no interest in producing hardware at this time.

I do make hardware, but the big Xilinx parts are out of range for any commercial viability. You need to be a big corporation to get them at reasonable prices.

We do ARM & RISC-V servers, easily pushing 300 Gb/s through a 1000 USD cluster. The great challenge is that there's no network fabric matching that price-to-performance ratio, so that's why I'm researching whether FPGAs might be the solution here.
Maybe this is just not a match, but I'm happy to discuss further.

@alexforencich
Member

research

Perhaps some of the functionality developed on top of Corundum will make it into the next generation of commercial, ASIC-based NICs. Or higher-level networking research will impact the design of switches, network stacks, etc. It's possibly not stable enough for production use as is, but this could be addressed if the project grows.

Anyway, if you're seriously interested in possibly using Corundum for something in production, that's certainly something that can be looked into. Depending on what you have in mind, maybe PCIe isn't even the right choice: I have been working on a Zynq version of Corundum with an AXI interface instead of PCIe, for example, and maybe something along those lines would make more sense.

@alexforencich
Member

Also, even if the big Xilinx parts are not economical, have you taken a more serious look at some of the lower-end parts, such as the Kintex UltraScale+ line? Like I said, the KU3P is already supported, and should be able to operate at around 50 Gbps. I'm not sure what sort of pricing you might be able to get from Xilinx for that part, but it's certainly going to be a lot better than anything Virtex, and you won't need any extra PHY chips to make it work.

@aep
Author

aep commented Dec 14, 2020

Like I said, the KU3P is already supported,

Yeah, the KU3P is about a grand for the chip alone, plus you need a ton of expensive power delivery.
That's more expensive than a Cisco Nexus, which can do 200 Gbps.

Corundum with an AXI interface

Neat!

@mithro

mithro commented Dec 14, 2020

Have you thought about using something like LiteX (with LiteDRAM) + LitePCIe (and maybe LiteICLink for connecting multiple smaller FPGAs together)? These cores have wide support for everything from low-end iCE40 parts up to high-end VU19P parts. @enjoy-digital has been doing some excellent work bringing up cheap high-end FPGA hardware from ex-bitcoin-mining rigs (see https://twitter.com/enjoy_digital/status/1329744466907979778 for example). Totally understand if you want to keep all your own implementations of this stuff.

@mithro

mithro commented Dec 14, 2020

(LiteX designs are also a core early target for the SymbiFlow project.)

@aep
Author

aep commented Dec 14, 2020

Thanks, but LiteX isn't really interesting for business strategy reasons.
Unless I'm mistaken, LiteEth doesn't support 10G anyway.

@vamposdecampos

One of the things I'm interested in (once I get non-negative free time) is a version of Corundum without the network ports, just the host PCIe logic (e.g. showing up as two NICs connected back-to-back) -- for learning and experimenting with the PCIe interface and driver. I'm hoping such a contraption would run on lower-end boards, like the SQRL Acorns.

@alexforencich
Member

alexforencich commented Dec 15, 2020

IIRC, the Kickstarter with the KU3P managed to get pricing from Xilinx permitting a board price of around $500. Not terribly cheap, but much more reasonable than many of the alternatives. Have you looked at the Cyclone 10 at all? If all you need is a 10G port accessible over PCIe, the Cyclone 10 GX supports PCIe gen 2 x4 and has 10G transceivers, so you could build a single-port 10G NIC with one of those. Anyway, if you just need a 10G NIC and no custom features, then commercial ASIC-based NICs are probably going to be more economical.

If you want two NICs back to back, that's trivial: the "core" Corundum logic exposes AXI stream interfaces, so you can easily cross-connect those internally and ignore the external ports. However, the SQRL Acorn is, I believe, Artix 7, so I would need to port the PCIe DMA components to 7-series, which is currently not on the roadmap.
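
For what it's worth, here's a minimal sketch of what that internal cross-connect could look like, assuming two MAC-facing AXI-Stream port pairs; the module and signal names here are hypothetical, not Corundum's actual port-level interface:

```verilog
// Hypothetical sketch: loop the transmit stream of each "NIC" instance into
// the receive stream of the other, so the design behaves like two NICs wired
// back-to-back with no external PHYs. Names and widths are illustrative only.
module axis_back_to_back #(
    parameter DATA_WIDTH = 64,
    parameter KEEP_WIDTH = DATA_WIDTH/8
) (
    // NIC 0 transmit stream (out of NIC 0 core logic)
    input  wire [DATA_WIDTH-1:0] nic0_tx_axis_tdata,
    input  wire [KEEP_WIDTH-1:0] nic0_tx_axis_tkeep,
    input  wire                  nic0_tx_axis_tvalid,
    output wire                  nic0_tx_axis_tready,
    input  wire                  nic0_tx_axis_tlast,

    // NIC 0 receive stream (into NIC 0 core logic)
    output wire [DATA_WIDTH-1:0] nic0_rx_axis_tdata,
    output wire [KEEP_WIDTH-1:0] nic0_rx_axis_tkeep,
    output wire                  nic0_rx_axis_tvalid,
    input  wire                  nic0_rx_axis_tready,
    output wire                  nic0_rx_axis_tlast,

    // NIC 1 transmit stream (out of NIC 1 core logic)
    input  wire [DATA_WIDTH-1:0] nic1_tx_axis_tdata,
    input  wire [KEEP_WIDTH-1:0] nic1_tx_axis_tkeep,
    input  wire                  nic1_tx_axis_tvalid,
    output wire                  nic1_tx_axis_tready,
    input  wire                  nic1_tx_axis_tlast,

    // NIC 1 receive stream (into NIC 1 core logic)
    output wire [DATA_WIDTH-1:0] nic1_rx_axis_tdata,
    output wire [KEEP_WIDTH-1:0] nic1_rx_axis_tkeep,
    output wire                  nic1_rx_axis_tvalid,
    input  wire                  nic1_rx_axis_tready,
    output wire                  nic1_rx_axis_tlast
);

    // NIC 0 TX feeds NIC 1 RX
    assign nic1_rx_axis_tdata  = nic0_tx_axis_tdata;
    assign nic1_rx_axis_tkeep  = nic0_tx_axis_tkeep;
    assign nic1_rx_axis_tvalid = nic0_tx_axis_tvalid;
    assign nic0_tx_axis_tready = nic1_rx_axis_tready;
    assign nic1_rx_axis_tlast  = nic0_tx_axis_tlast;

    // NIC 1 TX feeds NIC 0 RX
    assign nic0_rx_axis_tdata  = nic1_tx_axis_tdata;
    assign nic0_rx_axis_tkeep  = nic1_tx_axis_tkeep;
    assign nic0_rx_axis_tvalid = nic1_tx_axis_tvalid;
    assign nic1_tx_axis_tready = nic0_rx_axis_tready;
    assign nic0_rx_axis_tlast  = nic1_tx_axis_tlast;

endmodule
```

In practice you'd probably want a register slice or FIFO in each direction for timing closure and to decouple the ready/valid handshakes, but the direct assigns show the idea.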

@alexforencich
Member

Minor update to this: the DMA engine has been "split" into a core module + device-specific shim, so you would only need to write a shim for Artix 7 instead of a whole DMA engine. However, resource consumption of the overall design + timing closure would likely be a serious issue; timing closure is already a problem on Virtex 7, and the failures tend to be in the control logic and not the datapath.

@ohault

ohault commented Aug 26, 2023

Minor update to this: the DMA engine has been "split" into a core module + device-specific shim, so you would only need to write a shim for Artix 7 instead of a whole DMA engine. However, resource consumption of the overall design + timing closure would likely be a serious issue; timing closure is already a problem on Virtex 7, and the failures tend to be in the control logic and not the datapath.

With this, could support for Artix 7 be included in the Corundum roadmap?

@ohault

ohault commented Aug 26, 2023

One of the things I'm interested in (once I get non-negative free time) is a version of Corundum without the network ports, just the host PCIe logic (e.g. showing up as two NICs connected back-to-back) -- for learning and experimenting with the PCIe interface and driver. I'm hoping such a contraption would run on lower-end boards, like the SQRL Acorns.

I created #114 with this in mind too.

@alexforencich
Member

As for Artix 7 support on the roadmap: that's probably not going to happen, unless maybe someone wants to fund it somehow. It's already hard enough to close timing on Virtex 7, and Artix 7 is significantly smaller and slower. Artix US+ might be a different story. I have been mulling over making a stripped-down "corundum lite" that might work better for this sort of thing, but again the main problem is lack of time.

As for the "no network ports" thing, that might be something I'll need to look into in more detail. With the switch project coming along, it looks like one relatively common configuration is to have a PCIe link to control the switch, with additional Ethernet links to ports on the control SoC for network traffic. So it might make sense to have a way to configure the design for that sort of use case.

@ohault

ohault commented Apr 24, 2024

Any news about support for other FPGAs and lower-speed Ethernet flavors?

@alexforencich
Member

Lower-end parts: no updates, aside from the fact that we're working on a design around a K26 SoM as part of OCP TAP (2x 10G + 2 lanes of PCIe). Lower-speed Ethernet: support for running 10G-capable serdes at 1G is on the to-do list, and this will likely also enable running at 1G directly, but it's complicated by the fact that we want to support White Rabbit, and all of this has to work even when the reference clock isn't 156.25 MHz (which rules out CPLLs).
