Performance improvement in SPI #27

Open
tgiacchi opened this issue Jan 12, 2024 · 22 comments

Comments

@tgiacchi

Changing the spi_set_format call in my_spi_init to use SPI_CPHA_1 seems to give roughly a 10% improvement in performance.

@carlk3
Owner

carlk3 commented Jan 13, 2024

I'm surprised that works at all. ChaN says

> SPI mode 0 (CPHA=0, CPOL=0) is the proper setting to control MMC/SDC, but mode 3 (CPHA=1, CPOL=1) also works as well in most case.

but you're using mode 1? He doesn't cite his source, unfortunately. I looked at SD Specifications, Part 1, Physical Layer Specification, Version 3.01, February 18, 2010, and I couldn't find any mention of the SPI mode, and the bus timing diagrams are omitted in the simplified (free) versions of the SD Physical Layer specification. The Simplified Specifications are a subset of the complete SD Specifications, which I don't have. If you have access to the full Specifications, you might want to check whether SPI mode 1 operation is compliant.
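
For reference, the mode numbers in this exchange follow the conventional SPI numbering, where the mode is `(CPOL << 1) | CPHA`. The sketch below only illustrates that mapping; the helper names are hypothetical and are not part of this library or the Pico SDK.

```c
#include <assert.h>
#include <stdbool.h>

/* Conventional SPI mode numbering: mode = (CPOL << 1) | CPHA. */
static unsigned spi_mode_number(bool cpol, bool cpha) {
    return ((unsigned)cpol << 1) | (unsigned)cpha;
}

/* Inverse mapping; mode must be in the range 0..3. */
static void spi_mode_to_bits(unsigned mode, bool *cpol, bool *cpha) {
    assert(mode <= 3);
    *cpol = (mode & 2u) != 0;
    *cpha = (mode & 1u) != 0;
}
```

Under this numbering, CPOL=0 with CPHA=1 (the setting proposed in this issue) is mode 1, which is neither of the two modes ChaN mentions.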

@tgiacchi
Author

tgiacchi commented Jan 13, 2024 via email

@tgiacchi
Author

tgiacchi commented Jan 13, 2024 via email

@carlk3
Owner

carlk3 commented Jan 13, 2024

> I made the change based on looking at the rp2040 datasheet and scope views. I originally did it for the noOs version of the code to get that little extra performance I needed.

How are you measuring the performance improvement? I can't imagine how changing the phase could significantly change the throughput unless you are also increasing the baud rate.

@tgiacchi
Author

tgiacchi commented Jan 13, 2024 via email

@carlk3
Owner

carlk3 commented Jan 13, 2024

> No baud rate increase. I write data to the file and take the usec time at the start and end. Pretty simple. I am running at exactly 25 MHz. Tested on 5 different cards from several different manufacturers. When I find some free time I'll hook up the scope again, do a capture, and send it along.

Do you have a theory on how changing the clock phase could improve the data transfer rate?

There are many variables that could affect the timing. (See Performance Tuning Tips). For example, the file might span a segment on the SD card. Are you formatting the cards between tests?
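
The measurement described above (timestamps in microseconds at the start and end of a write) could be reduced to a throughput figure and a percentage difference roughly as follows. This is a minimal sketch with hypothetical helper names, not the code either participant used.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in bytes per second from a byte count and
   start/end timestamps in microseconds. */
static double throughput_bytes_per_sec(uint64_t bytes,
                                       uint64_t start_us,
                                       uint64_t end_us) {
    assert(end_us > start_us);
    return (double)bytes * 1e6 / (double)(end_us - start_us);
}

/* Percent change of candidate relative to baseline
   (positive means the candidate is faster). */
static double percent_change(double baseline, double candidate) {
    assert(baseline > 0.0);
    return (candidate - baseline) / baseline * 100.0;
}
```

On the Pico the timestamps might come from the SDK's `time_us_64()`. Repeating the run several times on a freshly formatted card would help separate a real difference from the run-to-run variation mentioned above.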

@tgiacchi
Author

tgiacchi commented Jan 13, 2024 via email

@carlk3
Owner

carlk3 commented Jan 14, 2024

> The ph0 drops the clock for a longer period.

I'm not sure what you mean by drops the clock. The clock signal is low for a longer part of the cycle? Does the frequency change? Does the card indicate "busy" for less time?

> Writing data to the file and get usec time at start and end.

How much data are you writing? Does big_file_test (which you can run from command_line) show a significant difference for, say, a 30 MB file?

@tgiacchi
Author

tgiacchi commented Jan 14, 2024 via email

@carlk3
Owner

carlk3 commented Jan 15, 2024

> Yes, clock is low for a longer period.

How does having the clock low for a longer period improve the performance? Is the overall clock cycle period shorter? I can't understand how it can be moving data faster unless the frequency goes up or somehow there's less host wait time (normally due to card busy time).

@tgiacchi
Author

tgiacchi commented Jan 15, 2024 via email

@carlk3
Owner

carlk3 commented Jan 15, 2024

Ah, so the frequency does change. What if you raise the baud_rate in ph0 instead of switching to ph1?

@tgiacchi
Author

tgiacchi commented Jan 15, 2024 via email

@carlk3
Owner

carlk3 commented Jan 15, 2024

I don't understand how the frequency can change. From RP2040 Datasheet:

> 4.4.3.6.1. Bit rate generation
> The serial bit rate is derived by dividing down the input clock, SSPCLK. The clock is first divided by an even prescale value CPSDVSR in the range 2-254, and is programmed in SSPCPSR. The clock is divided again by a value in the range 1-256, that is 1 + SCR, where SCR is the value programmed in SSPCR0.

where SSPCLK is wired to clk_peri. By default clk_peri is attached directly to the system clock. So I would guess that the serial bit rate would be some constant ratio to the clk_peri. How would changing the clock phase affect that?
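
The datasheet passage amounts to bit rate = SSPCLK / (CPSDVSR × (1 + SCR)). A minimal sketch of that arithmetic follows; the function name is hypothetical, and the 125 MHz figure used in the note after it is an assumption (the usual RP2040 default), not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bit rate = SSPCLK / (CPSDVSR * (1 + SCR)).
   CPSDVSR must be even and in 2..254; SCR must be in 0..255. */
static uint32_t ssp_bit_rate_hz(uint32_t sspclk_hz,
                                uint32_t cpsdvsr,
                                uint32_t scr) {
    assert(cpsdvsr >= 2 && cpsdvsr <= 254 && (cpsdvsr % 2) == 0);
    assert(scr <= 255);
    return sspclk_hz / (cpsdvsr * (1u + scr));
}
```

Assuming clk_peri is 125 MHz, `ssp_bit_rate_hz(125000000, 4, 255)` evaluates to 122070 Hz. Clock phase and polarity are not inputs to this formula, which is the point of the question above.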

@tgiacchi
Author

tgiacchi commented Jan 15, 2024 via email

@carlk3
Owner

carlk3 commented Jan 16, 2024

I tried using

        spi_set_format(spi_p->hw_inst, 8, SPI_CPOL_0, SPI_CPHA_1, SPI_MSB_FIRST);

and I can't get the card to initialize. Even at 122,070 Hz, it gets CRC errors on CMD0, "Go Idle State", which is the first command.
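
For context, an SD command frame in SPI mode is six bytes: a start/transmission byte carrying the command index, a 32-bit argument, and a CRC7 followed by an end bit. The sketch below builds such a frame with a bitwise CRC7 (polynomial x^7 + x^3 + 1). It is an illustration with hypothetical function names, not this library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC7 with polynomial x^7 + x^3 + 1, processed MSB first. */
static uint8_t sd_crc7(const uint8_t *data, size_t len) {
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t d = data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (uint8_t)(crc << 1);
            if ((d ^ crc) & 0x80) {
                crc ^= 0x09;
            }
            d = (uint8_t)(d << 1);
        }
    }
    return crc & 0x7F;
}

/* Build a 6-byte command frame: 0b01 + 6-bit index,
   big-endian 32-bit argument, then CRC7 and the end bit. */
static void sd_build_cmd(uint8_t index, uint32_t arg, uint8_t frame[6]) {
    assert(index < 64);
    frame[0] = (uint8_t)(0x40 | index);
    frame[1] = (uint8_t)(arg >> 24);
    frame[2] = (uint8_t)(arg >> 16);
    frame[3] = (uint8_t)(arg >> 8);
    frame[4] = (uint8_t)arg;
    frame[5] = (uint8_t)((sd_crc7(frame, 5) << 1) | 1);
}
```

For CMD0 with a zero argument this yields the familiar frame `40 00 00 00 00 95`. If the card samples the data line on a different clock edge than the host expects, the bits it receives may not match this frame, which would be one plausible reason for a CRC error on the very first command.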

@tgiacchi
Author

tgiacchi commented Jan 16, 2024 via email

@tgiacchi
Author

tgiacchi commented Jan 16, 2024 via email

@carlk3
Owner

carlk3 commented Jan 16, 2024

So strange. Guess it depends on the card.

Never mind, I was using the wrong hw_config.c file. D'oh!

Now that I have gotten that straightened out, I can't get any of these cards to initialize with SPI_CPHA_1:

In all cases, the SPI CLK is running at 398089 Hz. (The SD Specification says that initialization must happen at 100-400 kHz.)
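
A figure like 398089 Hz is what falls out of the even-prescale constraint when asking for at most 400 kHz. The sketch below searches all divider pairs by brute force for the fastest rate that does not exceed a target. It assumes clk_peri is 125 MHz and a 400 kHz request, neither of which is stated in this thread, and it is not the SDK's `spi_set_baudrate` algorithm; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t cpsdvsr;  /* even prescale, 2..254 */
    uint32_t scr;      /* serial clock rate field, 0..255 */
    uint32_t rate_hz;  /* resulting bit rate, 0 if none found */
} ssp_div_t;

/* Exhaustively search for the fastest bit rate <= target_hz. */
static ssp_div_t ssp_best_divisors(uint32_t sspclk_hz, uint32_t target_hz) {
    assert(target_hz > 0);
    ssp_div_t best = {0, 0, 0};
    for (uint32_t p = 2; p <= 254; p += 2) {
        for (uint32_t s = 0; s <= 255; s++) {
            uint32_t rate = sspclk_hz / (p * (s + 1));
            if (rate <= target_hz && rate > best.rate_hz) {
                best = (ssp_div_t){p, s, rate};
            }
        }
    }
    return best;
}
```

Under those assumptions, `ssp_best_divisors(125000000, 400000)` selects CPSDVSR = 2 and SCR = 156, for 398089 Hz, which sits inside the 100-400 kHz initialization window.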

@tgiacchi
Author

tgiacchi commented Jan 16, 2024 via email

@carlk3
Owner

carlk3 commented Jan 16, 2024

> If you have someplace I can post my code, maybe I'm missing something.

You can always fork this repository on GitHub.

@tgiacchi
Author

tgiacchi commented Jan 16, 2024 via email

@carlk3 carlk3 mentioned this issue Sep 7, 2024