XDMA: read file: Unknown error 512 when start 1MB C2H transfer(stream mode) on Ubuntu 18.04.1(4.15.0-88-generic) #54

Open
woshixiaohuya opened this issue Mar 21, 2020 · 22 comments

Comments

@woshixiaohuya

woshixiaohuya commented Mar 21, 2020

A detailed description is at https://forums.xilinx.com/t5/PCIe-and-CPM/XDMA-linux-driver-from-git-C2H-transfor-1MB-fail-crash-in-Stream/m-p/1087752#M16120.

Environment: Ubuntu 18.04.1, x64 OS, Vivado 2018.3.1

I cloned the master branch from the repository on 2020-03-17. My XDMA is configured in AXI-Stream mode.
The test command is './dma_from_device -d /dev/xdma0_c2h_0 -f data/output_file.bin -s transfersize -c 1'

When I transfer 4KB (transfersize) in one transfer, the driver works correctly.

When I transfer 1MB (transfersize) in one transfer, the driver returns an error.

I read libxdma.h and found two parameters that appear to be related to the transfer size: XDMA_TRANSFER_MAX_DESC and CYCLIC_RX_PAGES_MAX.

I changed XDMA_TRANSFER_MAX_DESC to 4096 and CYCLIC_RX_PAGES_MAX to 1024 and tested the 1MB transfer again, but the error persists:

/dev/xdma0_c2h_0, R off 0x0, 0xffffffffffffffff != 0x100000.
read file: Unknown error 512

@woshixiaohuya woshixiaohuya changed the title read file: Unknown error 512 when start a C2H transfer(stream mode) on Ubuntu 18.04.1(4.15.0-88-generic) read file: Unknown error 512 when start 1MB C2H transfer(stream mode) on Ubuntu 18.04.1(4.15.0-88-generic) Mar 21, 2020
@lesjokolat

@woshixiaohuya
Author

woshixiaohuya commented Mar 22, 2020

does the solution at bottom here help you?
https://forums.xilinx.com/t5/PCIe-and-CPM/XDMA-driver-test-scripts-error-messages/td-p/912101

It doesn't work. I think the offset (0x80000000) solution is mainly for AXI memory-mapped mode; my XDMA is in AXI-Stream mode.
I tried it anyway and got the same error. I added the offset 0x80000000 to the uint64_t base parameter in read_to_buffer() and write_from_buffer(), following another question (https://forums.xilinx.com/t5/PCIe-and-CPM/debug-the-driver-of-IP-PCIE-with-DMA/m-p/914480#M12599).

@lesjokolat

lesjokolat commented Mar 22, 2020

https://forums.xilinx.com/t5/PCIe-and-CPM/C2H-Streaming-XDMA-Linux-Driver-Broken/td-p/833977

anything in this one?

See this fix done as well.

fd5f152

@woshixiaohuya
Author

woshixiaohuya commented Mar 22, 2020

https://forums.xilinx.com/t5/PCIe-and-CPM/C2H-Streaming-XDMA-Linux-Driver-Broken/td-p/833977

anything in this one?

See this fix done as well.

fd5f152

Thank you. I have seen that link, which has no useful methods.
I will try the fix you mentioned; I think it may be useful.
I cloned the source code on 2020-03-17, but my code differs from the fix fd5f152, which was committed 26 days ago. For example, XDMA_TRANSFER_MAX_DESC is still 2048 and PAGE_SIZE_X86 can't be found in 'https://github.com/Xilinx/dma_ip_drivers/blob/master/XDMA/linux-kernel/libxdma/libxdma.h'.

@woshixiaohuya
Author

https://forums.xilinx.com/t5/PCIe-and-CPM/C2H-Streaming-XDMA-Linux-Driver-Broken/td-p/833977

anything in this one?

See this fix done as well.

fd5f152

Hello! I have tried [fd5f152], but it didn't work. I got the same error.

/dev/xdma0_c2h_0, R off 0x0, 0xffffffffffffffff != 0x100000.
read file: Unknown error 512

I have read some of the source code in dma_from_device.c and dma_utils.c (https://github.com/Xilinx/dma_ip_drivers/tree/master/XDMA/linux-kernel/tools). It seems that in read_to_buffer() and write_from_buffer(), read() or write() returns -1 (0xffffffffffffffff), which triggers the "rc != bytes" check. But I don't know why the two functions return -1.
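A note on the error text: 0xffffffffffffffff is simply -1 printed as an unsigned 64-bit value, and 512 is the value of ERESTARTSYS in the Linux kernel's internal errno list. That code is not part of the user-space errno set, which is why strerror() can only say "Unknown error 512". The dmesg output posted later in this thread shows xdma_xfer_submit reporting a timeout, which suggests (but does not prove) that the driver hands this code back after the transfer times out. The sketch below shows one way a test tool could report such a failure more readably. The helper names describe_errno and read_checked are hypothetical and not part of the XDMA tools.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Kernel-internal value (include/linux/errno.h); absent from user-space <errno.h>. */
#define KERNEL_ERESTARTSYS 512

/* Hypothetical helper: name an errno value, including the one strerror() cannot. */
static const char *describe_errno(int err)
{
    if (err == KERNEL_ERESTARTSYS)
        return "ERESTARTSYS (kernel-internal code leaked to user space)";
    return strerror(err);
}

/* Hypothetical helper: issue one read() and report why it failed or fell short. */
static ssize_t read_checked(int fd, void *buf, size_t bytes)
{
    ssize_t rc = read(fd, buf, bytes);

    if (rc < 0) {
        int saved = errno; /* keep errno before fprintf can change it */
        fprintf(stderr, "read failed: errno %d: %s\n", saved, describe_errno(saved));
    } else if ((size_t)rc != bytes) {
        fprintf(stderr, "short read: %zd of %zu bytes\n", rc, bytes);
    }
    return rc;
}
```

Calling read_checked(fd, buffer, 0x100000) on the C2H device in place of a bare read() would then print the decoded reason instead of "Unknown error 512".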

@lesjokolat

I'm not a dev, but I compared this to the older 65444 code of dma_from_device.c:

/* file argument given? */
if ((out_fd >= 0) & (no_write == 0)) {
    rc = write_from_buffer(ofname, out_fd, buffer,
                           size, i*size);

Is it me, or does this duplicate the size variable?
size or i*size

Should it just be one or the other?

@woshixiaohuya
Author

woshixiaohuya commented Mar 23, 2020

I'm not a dev, but I compared this to the older 65444 code of dma_from_device.c:

/* file argument given? */
if ((out_fd >= 0) & (no_write == 0)) {
    rc = write_from_buffer(ofname, out_fd, buffer,
                           size, i*size);

Is it me, or does this duplicate the size variable?
size or i*size

Should it just be one or the other?

I compared the old and new versions, and neither size nor i*size is a problem. The only differences are the memset() and posix_memalign() calls.

@woshixiaohuya woshixiaohuya changed the title read file: Unknown error 512 when start 1MB C2H transfer(stream mode) on Ubuntu 18.04.1(4.15.0-88-generic) XDMA: read file: Unknown error 512 when start 1MB C2H transfer(stream mode) on Ubuntu 18.04.1(4.15.0-88-generic) Mar 28, 2020
@woshixiaohuya
Author

@bhathaway would you give this issue some attention?

@lesjokolat

Hi again. I am not a dev, but I saw that AWS had similar issues and made some references to a 4096 buffer size.

The search results also pointed to some other code sections that you experts might make sense of. I tried to show the first link, where this repository does the memalign. Hope it helps.

From:
https://github.com/Xilinx/dma_ip_drivers/blob/8b8c70b697f049649d5fa99be9c6bc4302d89ac9/XDMA/linux-kernel/tools/dma_from_device.c

posix_memalign((void **)&allocated, 4096 /*alignment */ , size + 4096);
if (!allocated) {
    fprintf(stderr, "OOM %lu.\n", size + 4096);
    rc = -ENOMEM;
    goto out;
}

From:
https://github.com/aws/aws-fpga/search?q=4096&unscoped_q=4096

//Read from specified address, specified size within a bank
//Caller's responsibility to do sanity checks. No sanity checks done here
int readBank(std::ofstream& aOutFile, unsigned long long aStartAddr, unsigned long long aSize) {
  char *buf = 0;
  unsigned long long blockSize = 0x20000;
  if (posix_memalign((void **)&buf, 4096, blockSize))
    return -1;
  std::memset(buf, 0, blockSize);

  if (posix_memalign(&buf, 4096, blockSize+1))//Last is for termination char
    return -1;
  if (posix_memalign(&bufPattern, 4096, blockSize+1)) {//Last is for termination char
    free(buf);
    return -1;

From:
https://github.com/aws/aws-fpga/blob/7f1e76765766f579d3a767bedf669019d50342f3/hdk/common/software/src/xdma_utils.c

h2c_desc_table = (XDMA_DESC *)memalign(4096, 4096); // allocate 4k aligned to a 4k boundary

sv_map_host_memory(h2c_desc_table);
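For readers comparing these snippets: the dma_from_device.c excerpt tests the pointer after calling posix_memalign, while the AWS excerpts test the return value. posix_memalign reports failure through its return value (an error number), so checking that is the documented way to detect a failed allocation. The sketch below shows a 4096-byte-aligned allocation written that way. The helper names alloc_dma_buffer and is_aligned are hypothetical, and nothing in this thread establishes that buffer alignment is related to error 512.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a zeroed buffer aligned to `alignment` bytes,
 * or NULL if posix_memalign reports an error (EINVAL or ENOMEM). */
static void *alloc_dma_buffer(size_t size, size_t alignment)
{
    void *buf = NULL;

    /* posix_memalign returns 0 on success and an error number otherwise. */
    if (posix_memalign(&buf, alignment, size) != 0)
        return NULL;

    memset(buf, 0, size);
    return buf;
}

/* Hypothetical helper: check that a pointer sits on an `alignment` boundary. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

A caller would request the buffer with alloc_dma_buffer(size, 4096), bail out on NULL, and release it with free() when done.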

@hmaarrfk

I'm just going to say that @bhathaway is not affiliated with Xilinx and is one of their customers. You should probably ping Xilinx harder and not a fellow user.

Unless we decide to create our own fork, which we could probably maintain with little effort.

@woshixiaohuya
Author

I'm just going to say that @bhathaway is not affiliated with Xilinx and is one of their customers. You should probably ping Xilinx harder and not a fellow user.

Unless we decide to create our own fork, which we could probably maintain with little effort.

Sorry, I thought you were the developer of this driver. I made a wrong judgment.

@hmaarrfk

no, you can see that their PR is in limbo just like the rest of our issues ;)

Xilinx basically refuses to interact here. It is likely because their management banned them from using unauthorized channels to communicate. So just keep pinging them on their forums or through your FAE (but they likely aren't PCIe experts).

@woshixiaohuya
Author

no, you can see that their PR is in limbo just like the rest of our issues ;)

Xilinx basically refuses to interact here. It is likely because their management banned them from using unauthorized channels to communicate. So just keep pinging them on their forums or through your FAE (but they likely aren't PCIe experts).

Okay, thanks very much!

@woshixiaohuya
Author

@karenx-xilinx I believe you are a developer at Xilinx. Would you give this issue some attention? It has been posted on the Xilinx community forum but has received no reply.

@woshixiaohuya
Author

woshixiaohuya commented Apr 8, 2020

I ran some more tests. Setup: stream mode, Vivado 2018.3.1, Ubuntu 18.04.1 (kernel 4.15.0-91), using the dma_from_device command.
1MB, one count: ./dma_from_device -d /dev/xdma0_c2h_0 -s 0x100000 -c 1
4KB, 4096 count: ./dma_from_device -d /dev/xdma0_c2h_0 -s 0x1000 -c 4096

| Item | Settings | Result |
| --- | --- | --- |
| 1 | XDMA IP: MSI-X enabled, Legacy and MSI disabled. load_driver.sh changed to: insmod ../xdma/xdma.ko interrupt_mode=0 | C2H 1MB, one count: read file: Unknown error 512 (dmesg log4). ILA shows that the data generator sent 128KB to the XDMA IP. |
| 2 | XDMA IP: Legacy enabled, MSI-X and MSI disabled. load_driver.sh changed to: insmod ../xdma/xdma.ko interrupt_mode=2 | C2H 1MB, one count: read file: Unknown error 512. ILA shows that the data generator sent 128KB to the XDMA IP. |
| 3 | XDMA IP: MSI enabled, MSI-X and Legacy disabled. load_driver.sh changed to: insmod ../xdma/xdma.ko interrupt_mode=1 | C2H 1MB, one count: read file: Unknown error 512. ILA shows that the data generator sent 424KB to the XDMA IP. C2H 4KB, 4096 count: read file: Unknown error 512 (dmesg log5). |

dmesg log4
[ 134.039435] xdma:cdev_xvc_init: xcdev 0x00000000fe39ecdf, bar 0, offset 0x40000.
[ 197.852038] xdma:xdma_xfer_submit: xfer 0x00000000272bd6a7,1048576, s 0x1 timed out, ep 0x100000.
[ 197.852044] xdma:engine_reg_dump: 0-C2H0-ST: ioread32(0x00000000f15633a4) = 0x1fc18006 (id).
[ 197.852047] xdma:engine_reg_dump: 0-C2H0-ST: ioread32(0x0000000019e2d255) = 0x00000001 (status).
[ 197.852050] xdma:engine_reg_dump: 0-C2H0-ST: ioread32(0x0000000066484305) = 0x00f83e1f (control)
[ 197.852053] xdma:engine_reg_dump: 0-C2H0-ST: ioread32(0x00000000c0a9725c) = 0x33040000 (first_desc_lo)
[ 197.852055] xdma:engine_reg_dump: 0-C2H0-ST: ioread32(0x0000000009e84815) = 0x00000000 (first_desc_hi)
[ 197.852058] xdma:engine_reg_dump: 0-C2H0-ST: ioread32(0x0000000035736d8f) = 0x0000000f (first_desc_adjacent).
[ 197.852061] xdma:engine_reg_dump: 0-C2H0-ST: ioread32(0x00000000391dd373) = 0x00000020 (completed_desc_count).
[ 197.852064] xdma:engine_reg_dump: 0-C2H0-ST: ioread32(0x00000000fee909ee) = 0x00f83e1e (interrupt_enable_mask)
[ 197.852068] xdma:engine_status_dump: SG engine 0-C2H0-ST status: 0x00000001: BUSY
[ 197.852071] xdma:transfer_abort: abort transfer 0x00000000272bd6a7, desc 256, engine desc queued 0.
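The numbers in dmesg log4 can be cross-checked against the ILA observation. The aborted transfer is 1048576 bytes and was split into 256 descriptors ("desc 256"), while completed_desc_count reads 0x20. Assuming every descriptor covers an equal share of the buffer (1048576 / 256 = 4096 bytes, an inference from the log rather than something the driver states), 32 completed descriptors correspond to 128KB, which matches the 128KB the ILA capture reports for test 1. The engine is still BUSY, so it appears to be waiting for data that never arrives. The helper name bytes_completed below is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: bytes covered by the completed descriptors, assuming
 * the transfer is split evenly across all of its descriptors. */
static uint64_t bytes_completed(uint64_t total_bytes, uint32_t desc_total,
                                uint32_t desc_completed)
{
    if (desc_total == 0)
        return 0; /* avoid dividing by zero on an empty transfer */

    return (total_bytes / desc_total) * desc_completed;
}
```

With the values from log4, bytes_completed(1048576, 256, 0x20) evaluates to 131072, that is 128KB out of the requested 1MB.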

dmesg log5
[ 109.467648] xdma:remove_one: pdev 0x00000000751b6686, xdev 0x00000000d3d106be, 0x000000004c5360aa.
[ 109.467648] xdma:xpdev_free: xpdev 0x00000000d3d106be, destroy_interfaces, xdev 0x000000004c5360aa.
[ 109.468437] xdma:xpdev_free: xpdev 0x00000000d3d106be, xdev 0x000000004c5360aa xdma_device_close.
[ 109.549240] xdma:xdma_mod_init: Xilinx XDMA Reference Driver xdma v2019.2.51
[ 109.549242] xdma:xdma_mod_init: desc_blen_max: 0xfffffff/268435455, sgdma_timeout: 10 sec.
[ 109.549259] xdma:xdma_threads_create: xdma_threads_create
[ 109.549666] xdma:xdma_device_open: xdma device 0000:01:00.0, 0x00000000751b6686.
[ 109.549809] xdma:map_single_bar: BAR0 at 0x53200000 mapped at 0x000000007f6986ce, length=1048576(/1048576)
[ 109.549828] xdma:map_single_bar: BAR1 at 0x53300000 mapped at 0x00000000e026e2ec, length=65536(/65536)
[ 109.549830] xdma:map_bars: config bar 1, pos 1.
[ 109.549830] xdma:identify_bars: 2 BARs: config 1, user 0, bypass -1.
[ 109.549899] xdma:pci_keep_intx_enabled: 0000:01:00.0: clear INTX_DISABLE, 0x406 -> 0x6.
[ 109.549911] xdma:probe_one: 0000:01:00.0 xdma0, pdev 0x00000000751b6686, xdev 0x0000000070206296, 0x00000000c37d28a4, usr 16, ch 2,2.
[ 109.550289] xdma:cdev_xvc_init: xcdev 0x00000000562761cb, bar 0, offset 0x40000.

@lesjokolat

@woshixiaohuya
Author

See this one
https://forums.xilinx.com/t5/PCIe-and-CPM/memory-mapped-XDMA-c2h-from-Stream-FIFO/m-p/1109864#M16539

512 error and xdma stream

The referenced forum thread is about memory-mapped mode, but this issue occurs in stream mode.

@hudaweidavid

I had the same problem: if the amount of data exceeds 8K, the transfer times out.

@woshixiaohuya
Author

I had the same problem: if the amount of data exceeds 8K, the transfer times out.

This problem has existed for a long time. The developers put more effort into AXI memory-mapped mode, which causes trouble for those of us using AXI-Stream mode.

@hmaarrfk

Hello,

My name is Mark Harfouche. I am not affiliated with Xilinx in any way. Over the
years of using QDMA, I've wanted better community organization.

I've created a fork of dma_ip_drivers which I intend to maintain and work with the
community at large to improve.

The fork can be found at https://github.com/hmaarrfk/dma_ip_drivers

For now, I am stating the main goals of the repository in
hmaarrfk#2

If you are interested in working together, feel free to open an issue or PR to
my fork.

Best,

Mark

@mayureshw

mayureshw commented Mar 29, 2024

I had the same problem: if the amount of data exceeds 8K, the transfer times out.

This problem has existed for a long time. The developers put more effort into AXI memory-mapped mode, which causes trouble for those of us using AXI-Stream mode.

I just ran into the same issue, and a search led me to this page. Indeed, in stream mode, as the write size reaches 8K, it just blocks.

However, if I write using the "dd" command, it does not seem to block. Any idea why?

E.g. this doesn't block, even for sizes up to 1MB:

C2H=/dev/xdma0_c2h_0
H2C=/dev/xdma0_h2c_0 
SRC=/dev/zero
SINK=/dev/null
COUNT=1
BS=8192

dd bs=$BS count=$COUNT if=$C2H of=$SINK &
dd bs=$BS count=$COUNT if=$SRC of=$H2C
wait

UPDATE: The issue triggers for a write of exactly 8191 bytes; writes of up to 8190 bytes are fine.

@mayureshw

mayureshw commented Mar 29, 2024

If this helps: strace shows the use of writev, which returns EINVAL when the write size reaches 8191. Below 8191 it uses write(), which succeeds.

writev(4, [{iov_base=NULL, iov_len=0}, {iov_base="6+A;\32\26?\0_y[\10g\25\fM\33pR\23^D--~\f]\3<n\0\n"..., iov_len=8192}], 2) = -1 EINVAL (Invalid argument)

UPDATE: I was using C++ ofstream. I switched to C's write and read calls, and I no longer hit the 8192 limit.
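The workaround above can be sketched in C as a plain write(2) loop, so no buffering layer gets the chance to switch to writev once the payload approaches its internal buffer size. The helper name write_all is hypothetical. Whether retrying a short write is appropriate on an XDMA stream channel is an assumption, since it would split one buffer across several write() calls.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: push all `len` bytes through plain write(2), retrying
 * on short writes and EINTR. Returns 0 on success, -1 on error (errno set). */
static int write_all(int fd, const void *data, size_t len)
{
    const char *p = data;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before writing anything: retry */
            return -1;
        }
        if (n == 0) {
            errno = EIO; /* no progress: give up rather than spin forever */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would open the H2C device with open(path, O_WRONLY) and pass the resulting descriptor to write_all together with the buffer and its length.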
