
Processing SVF files directly #23

Open
gsteiert opened this issue Apr 21, 2024 · 9 comments

@gsteiert

Loading devices from SVF files with openFPGALoader takes forever.
How difficult would it be to take the SVF parsing from openFPGALoader and run it locally on the RP2040?
It would be nice to be able to send an SVF file into the RP2040 through a virtual UART and process it locally.
Could we reuse the SVF parsing from openFPGALoader, or should we look for one that already runs on an MCU?
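
A minimal sketch of what local processing could look like, assuming SVF statements stream in over the CDC UART and are executed as each `;`-terminated statement completes. `uart_getc_blocking()`, `jtag_shift_ir()` and `jtag_shift_dr()` are hypothetical stand-ins for the firmware's real primitives, and only write-only SDR/SIR are handled:

```c
/* Sketch only, not this project's code: stream SVF over the CDC UART and
 * execute ';'-terminated statements on the RP2040. */
#include <stddef.h>
#include <stdio.h>

extern int  uart_getc_blocking(void);                     /* hypothetical */
extern void jtag_shift_ir(const char *tdi_hex, int bits); /* hypothetical */
extern void jtag_shift_dr(const char *tdi_hex, int bits); /* hypothetical */

static void svf_execute(char *stmt)
{
    int bits;
    static char tdi[2048];            /* bitstream payloads are large */
    if (sscanf(stmt, " SIR %d TDI (%2047[0-9A-Fa-f])", &bits, tdi) == 2)
        jtag_shift_ir(tdi, bits);
    else if (sscanf(stmt, " SDR %d TDI (%2047[0-9A-Fa-f])", &bits, tdi) == 2)
        jtag_shift_dr(tdi, bits);
    /* RUNTEST, STATE, TDO/MASK checking, etc. would be added here. */
}

void svf_uart_loop(void)
{
    static char stmt[4096];
    size_t n = 0;
    for (;;) {
        int c = uart_getc_blocking();
        if (c == ';') {               /* SVF statements end with ';' */
            stmt[n] = '\0';
            svf_execute(stmt);
            n = 0;
        } else if (n < sizeof stmt - 1) {
            stmt[n++] = (char)c;
        }
    }
}
```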

@phdussud
Owner

Can you provide some concrete data? How does it compare with bin programming? What JTAG frequency do you use? Do you know where the bottleneck is? Can you share your SVF file? It turns out that the author of openFPGALoader wasn't all that positive about enabling SVF support for every FPGA target, because SVF is inherently slow compared to bin programming. The reason is that delays typically have to be inserted in places to cover programming delays imposed by the hardware. In bin mode, openFPGALoader loops on a status register read until the device is ready. SVF does not allow this, so a conservative delay is typically introduced, which is always slower than the probe loop.
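
A sketch of the difference being described, with hypothetical names throughout: SVF can only encode a fixed worst-case wait, while bin mode can exit the moment the status register says the part is ready.

```c
/* Illustrative only; run_test_idle_clocks(), read_status_register() and
 * STATUS_READY are hypothetical. */
extern void     run_test_idle_clocks(unsigned long n);
extern unsigned read_status_register(void);
#define STATUS_READY 0x01u

/* SVF style: "RUNTEST 100000 TCK;" bakes in a conservative fixed wait. */
void wait_svf_style(void)
{
    run_test_idle_clocks(100000);     /* always pays the worst case */
}

/* bin style: poll and continue as soon as the device is ready. */
void wait_bin_style(void)
{
    while (!(read_status_register() & STATUS_READY))
        ;                             /* typically exits much sooner */
}
```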

@gsteiert
Author

The board I am using (MAX10 10M08 Evaluation Kit) does not have bin support for comparison, but the SVF programming takes several minutes for a small device. I was running at the fastest 16 Mbps supported by pico-dirtyJtag. There may be some potential optimizations, like skipping verification of the image.
The way SVF works, it has to check the TDO data for each transaction. I expect it would be quicker to check the data on the MCU than to send the TDO data back to openFPGALoader for processing. The other advantage would be eliminating openFPGALoader altogether: you could send the SVF file directly to the UART and do all the processing on the Pico board.

I may attempt this myself. I am mostly wondering whether you think the openFPGALoader SVF parsing could be ported to the RP2040, since you seem to be familiar with both, or whether you would look for another SVF implementation to port.
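
The check itself is cheap on the MCU. A sketch of the masked compare an on-Pico SVF engine would do for a `TDO (...) MASK (...)` clause (names assumed, not from this repo):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch: compare the captured TDO bytes against the expected value under
 * the mask, entirely on the MCU, so no per-transaction USB round trip. */
static bool tdo_matches(const uint8_t *captured, const uint8_t *expected,
                        const uint8_t *mask, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        if ((captured[i] & mask[i]) != (expected[i] & mask[i]))
            return false;             /* fail the SVF run here */
    return true;
}
```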

@phdussud
Owner

phdussud commented Apr 22, 2024

I am guessing you are right: SVF checking TDO may be the reason it is slow. However, there must be a generation mode where the generated SVF does not do this. For sure, this shortened Lattice file does not, and it loads 300 KB in a couple of seconds.
bitstream-shortened.zip
As far as SVF parsers go, I only know the one in openFPGALoader, and it is C++ based; you would have to add the right C++ support libraries. It isn't anything I would be interested in merging back into this project, because it is too specific a case to be interesting to most users. I would be glad to direct people who could benefit from it to your repo. Good luck!
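
For reference, the two flavors look like this in the SVF text itself (illustrative values, not from the attached file): a generator that omits the `TDO`/`MASK` clause produces write-only shifts that never stall on a readback.

```
! Verifying flavor: the captured TDO is compared under MASK.
SDR 16 TDI (A50F) TDO (0344) MASK (FFFF);
! Write-only flavor: nothing is read back or compared.
SDR 16 TDI (A50F);
```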

@luyi1888
Contributor

I don't use an FPGA at all, but I am facing the same situation with OpenOCD. In my case, reading every 4 bytes of memory requires sending at least 2 command queues to the Pico: I guess one for TAP_STATE_MOVE + IR write and one for TAP_STATE_MOVE + DR read. OpenOCD does not submit read commands in bulk.

In the screenshot, you can see the Pico needs about 4 ms to process each command queue, which explains why the dump speed is 0.15 kB/s.
Of course, I am running under VMware and the code on the Pico is not optimized, but fixing that would not help much, because USB sends a frame every 1 ms: going from 4 ms to 1 ms would only take 0.15 kB/s to 0.6 kB/s, which is not enough.

[screenshot: cjtag]
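
Those numbers are latency-bound, as a back-of-the-envelope check shows (values taken from the comment above, not re-measured):

```c
#include <stdio.h>

/* Sketch of the arithmetic: 2 command queues per 4-byte read at ~4 ms per
 * queue round trip puts a hard ceiling on throughput. */
int main(void)
{
    double per_queue = 4e-3;                         /* ~4 ms per queue      */
    double per_read  = 2 * per_queue;                /* 2 queues per 4 bytes */
    printf("ceiling: %.0f B/s\n", 4.0 / per_read);   /* 500 B/s              */
    /* Even at the 1 ms USB FS frame time, 4 B / 2 ms = 2 kB/s at best,
     * the same order of magnitude as the 0.6 kB/s estimated above. */
    return 0;
}
```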

@luyi1888
Contributor

luyi1888 commented May 23, 2024

Besides making openFPGALoader/OpenOCD run on the Pico, I think the most generic way is to use an MPU for this. Then there is no need to modify OpenOCD to really submit commands in bulk; you just write a JTAG adapter driver as usual.
The ARM core runs Linux and OpenOCD, and the MCU core acts as an I/O processor. Shared memory exchanges data between the ARM and the MCU, and the MCU does JTAG with GPIO bit-banging and SPI, like the original dirtyJTAG.
I think the shared memory would eliminate the processing/transfer time of every command queue.

I have already done some tests on a Rockchip RV1106: the MCU core can bit-bang at 1 MHz. Compared to ST/TI offerings, the board is Pico sized, cheap, and easy to get.
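
A sketch of the shared-memory exchange being proposed, assuming a single-producer/single-consumer ring visible to both cores (all names and sizes are made up, and real code would also need memory barriers and cache maintenance):

```c
#include <stdint.h>

#define RING_SIZE 4096u                       /* power of two */

struct jtag_ring {
    volatile uint32_t head;                   /* ARM/Linux side writes */
    volatile uint32_t tail;                   /* MCU side writes       */
    uint8_t data[RING_SIZE];
};

/* MCU side: drain queued JTAG command bytes with no per-command USB or
 * driver round trip; the ARM side just advances head as it fills data[]. */
static inline int ring_pop(struct jtag_ring *r, uint8_t *out)
{
    if (r->tail == r->head)
        return 0;                             /* ring empty */
    *out = r->data[r->tail & (RING_SIZE - 1u)];
    r->tail++;
    return 1;
}
```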

@luyi1888
Contributor

How do J-Link and other commercial debuggers handle this?
If I use OpenOCD + J-Link, do I get the same result as with the Pico?
Is J-Link GDB Server a huge improvement? I guess they really do submit commands in bulk.

@phdussud
Owner

About J-Link and the J-Link GDB server: the microprocessor in the J-Link device handles all of the chatty traffic of the debug protocol itself. It only sends the results to the GDB server over the USB line.
The reason you get such poor USB utilization is that you need a USB turnaround (send, then receive) every 64 bits of communication between OpenOCD and the adapter. openFPGALoader sends binary without the need for a turnaround, and I see that the USB bus (FS) is almost totally used up when I use pico-dirtyJtag.
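
Rough numbers behind that, assuming each send-then-receive exchange costs on the order of two 1 ms USB FS frames:

```c
#include <stdio.h>

/* Illustrative arithmetic only: a protocol that turns the bus around
 * every 64 bits is frame-time limited, not bandwidth limited. */
int main(void)
{
    double turnaround = 2e-3;       /* ~2 ms per exchange (assumed) */
    double payload    = 64.0 / 8.0; /* 64 bits = 8 bytes            */
    printf("chatty: ~%.0f B/s\n", payload / turnaround);  /* ~4 kB/s */
    /* Streamed bulk transfers on the same FS link can sustain roughly
     * 1 MB/s, which is why bin mode nearly saturates the bus. */
    return 0;
}
```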


@luyi1888
Contributor

Got it. So things like the Black Magic Probe will be the fastest. Thank you for your explanation.
