I2S microphone stutter with IDF v5.0 (IDFGH-9764) #11098

Closed
3 tasks done
jayavanth opened this issue Mar 31, 2023 · 18 comments
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Bug bugs in IDF

Comments

@jayavanth

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

v5.0

Operating System used.

macOS

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

None

Development Kit.

Custom Board with ESP32-S3-WROOM-1-N16R8

Power Supply used.

USB

What is the expected behavior?

Clear sound from the mic

What is the actual behavior?

  1. Noisy and loud sound when recorded with 8-bit
  2. Stutter with 16-bit

Steps to reproduce.

Mic: CMM-4030D-261-I2S-TR (Datasheet)
IDF version: 5.0

I used the examples/i2s/i2s_recorder and modified it for TDM instead of PDM.

My init_microphone() function

void init_microphone(void)
{
    i2s_chan_config_t chan_cfg = I2S_CHANNEL_DEFAULT_CONFIG(I2S_NUM_AUTO, I2S_ROLE_MASTER);
    ESP_ERROR_CHECK(i2s_new_channel(&chan_cfg, NULL, &rx_handle));

    i2s_tdm_config_t tdm_rx_cfg = {
        .clk_cfg = I2S_TDM_CLK_DEFAULT_CONFIG(CONFIG_EXAMPLE_SAMPLE_RATE),
        /* The default mono slot is the left slot (whose 'select pin' of the PDM microphone is pulled down) */
        .slot_cfg = I2S_TDM_PCM_LONG_SLOT_DEFAULT_CONFIG(I2S_DATA_BIT_WIDTH_16BIT, I2S_SLOT_MODE_MONO,
                                I2S_TDM_SLOT0 | I2S_TDM_SLOT1 | I2S_TDM_SLOT2 | I2S_TDM_SLOT3),
        .gpio_cfg = {
            .bclk = 6,
            .din = 5,
            .ws  = 1, // LRCL (WS)
            .invert_flags = {
                .bclk_inv = false,
            },
        },
    };
    ESP_ERROR_CHECK(i2s_channel_init_tdm_mode(rx_handle, &tdm_rx_cfg));
    ESP_ERROR_CHECK(i2s_channel_enable(rx_handle));
}

Although this is not a TDM microphone, this config gave the best output I got. With the STD configs, I got very noisy audio or just silence.

Sample for 8-bit audio: https://voca.ro/1lqOUedU4wOa (WARNING: VERY VERY LOUD)
Sample for 16-bit audio: https://voca.ro/11n2lNsQuS0r

This is what it looks like in Audacity. There are some empty slots, but they don't all appear to be the same size.
[Screenshot 2023-03-31 at 12 19 50 PM]

Debug Logs.

No response

More Information.

No response

@jayavanth jayavanth added the Type: Bug bugs in IDF label Mar 31, 2023
@github-actions github-actions bot changed the title I2S microphone stutter with IDF v5.0 I2S microphone stutter with IDF v5.0 (IDFGH-9764) Mar 31, 2023
@espressif-bot espressif-bot added the Status: Opened Issue is new label Mar 31, 2023
@cdluv

cdluv commented Apr 1, 2023

I'm no expert with ESP-IDF, but I have experience with data sampling. I could be way off, but it's worth checking to at least rule out my observations:

  1. The 8-bit recording was obviously distorted, but there were no noticeable "gaps" or "glitches" in the recorded audio.
  2. The 16-bit recording was "choppy". This normally implies a timing problem (samples being recorded too slowly).
  3. The example from https://github.com/espressif/esp-idf/blob/v5.0.1/examples/peripherals/i2s/i2s_recorder/main/i2s_recorder_main.c stores data to an SD card, which takes CPU cycles. What frequency are you running the SPI bus at, and is the SD card "clean" in terms of other files/directories that could exist on it? Please log write start and end times, as any delays here can cause choppiness.

From these observations, the ESP32 in 16-bit mode wasn't keeping up with recording the data. Can you confirm the configured CPU speed of the ESP32?

Also, from the docs, can you check whether your app is going into light sleep by looking at the CONFIG_PM_ENABLE setting in your setup?

https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/peripherals/i2s.html

Details

When the power management is enabled (i.e. CONFIG_PM_ENABLE is on), the system will adjust or stop the source clock of I2S before going into light sleep, thus potentially changing the I2S signals and leading to transmitting or receiving invalid data.

@jayavanth
Author

Thanks cdluv! Those are some good points

Since the board I am using doesn't have SD card reader, I have changed the example to write to PSRAM instead. After everything is written into PSRAM, I send the array over via socket to my computer so I can play the .wav file.

Initially, I was streaming to socket right after i2s_channel_read but when I saw the choppy audio, I moved everything down. So only after I write everything to the PSRAM, I connect to wifi and then do the socket stuff.

CPU frequency: 160 MHz
PSRAM: 8MB, Octal Mode at 80MHz

Should I increase the CPU speed?

I looked at the settings and I see that

# CONFIG_PM_ENABLE is not set
CONFIG_PM_POWER_DOWN_CPU_IN_LIGHT_SLEEP=y
CONFIG_PM_POWER_DOWN_TAGMEM_IN_LIGHT_SLEEP=y

This shouldn't affect I2S, right?

@jayavanth
Author

I also tried speeding up (240MHz) and slowing down (80MHz) the CPU but there was no difference

@cdluv

cdluv commented Apr 1, 2023

Thanks for the info. If speeding up the CPU to 240MHz didn't help, then the choppiness is coming from elsewhere. Are you sending any information over UART while capturing the data? If so, try to not send output to the console, as this will cause an ISR to do work in the background when you least want it to.

I'm not sure what the default setting for CONFIG_PM_ENABLE is, so you could try setting it to n, just in case the default value is y.

CONFIG_PM_ENABLE=n

Could you share your code which takes the I2S samples and copies to SPIRAM?

One other thing that might be causing an issue is if Wi-Fi is enabled (it shouldn't, as it runs on a different core). Try disabling Wi-Fi while you record the I2S data.

(Note: adc_power_off might be deprecated, I haven't checked. If it is, comment out the lines annotated with DELETE in the code below.)

#include <WiFi.h>
#include <esp_wifi.h>
#include "driver/adc.h"    // DELETE 
 
void setup() {
    adc_power_off();    // DELETE
    WiFi.disconnect(true);  // Disconnect from the network
    WiFi.mode(WIFI_OFF);    // Switch WiFi off
}

Without looking at the code, I'm out of helpful ideas right this second.

@jayavanth
Author

jayavanth commented Apr 2, 2023

  1. I disabled output to console
  2. I changed CONFIG_PM_ENABLE=n. The documentation also says the default is n
  3. Wifi.h was not found but I found the equivalent wifi_disconnect(); from $IDF_PATH/examples/common_components/protocol_examples_common/include/protocol_examples_common.h. But that didn't work because it complained about not being in the right state. So I think you can only turn off if you turn it on.

With these 3 I still see the stutter. I'll check again if I can turn wifi off. But I'm only turning wifi on after I copy to PSRAM.

I have pasted my code here: https://gist.github.com/jayavanth/61938c8b52d1b5b3f24dc7e06ba7da89. Important parts are commented with cdluv

Thank you for your help! 🙏

@cdluv

cdluv commented Apr 2, 2023

Thanks for sharing your code, and thanks for feeding back. I've learned something from you :-) I haven't compiled your code - just walked through it in my head and found a few things of slight concern - any combination of which could contribute to the overall choppiness you're encountering.

  1. You're making a bunch of PSRAM memory allocations up front, one for every sampling iteration (or "row", as you call it). Is there any reason why you're not just writing to a contiguous block of memory?
  2. Not sure if you commented out this line of code on line 123? This would slow things down even if not printing...
    printf("[0] %d [1] %d [2] %d [3]%d ...\n", i2s_readraw_buff[0], i2s_readraw_buff[1], i2s_readraw_buff[2], i2s_readraw_buff[3]);
  3. When syncing, you're using "bytes_read" as the size of the sample (L122, L139 and L231). I believe this value can be different every time i2s_channel_read() is called, which means that you need to store this value too if you're going to reconstruct the buffers into a single socket stream from your 2D array.
  4. At L333, there's a call to ESP_ERROR_CHECK(esp_netif_init()); which means the network interface is being initialised. Try commenting it out.

MOST IMPORTANTLY:
5. Try using DMA buffers if possible. It is possible to "double-buffer" recordings so that while you are copying audio to SPIRAM, the ESP32 continues to populate the next available DMA buffer. That gives your code some "breathing space" to process the audio buffer without worrying about timing so much. It's going to be a bit of a rabbit hole but, imho, worth the pain to understand. Here's a link which explains DMA: https://www.atomic14.com/2021/04/20/esp32-i2s-dma-buf-len-buf-count.html ...

To fix 1 and 3, may I suggest you create one large array for your sample output (if malloc allows it, of course), and copy each sample buffer to the next writable position in the PSRAM buffer? If that's not possible, points 2, 3, and 4 remain relevant.
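A minimal sketch of the "one large array" suggestion, written as plain C so it can be tried off-target. The names capture_t and capture_append are hypothetical and do not come from the gist. On the ESP32-S3 the data pointer would presumably come from heap_caps_malloc with MALLOC_CAP_SPIRAM, and n would be the bytes_read value reported by each i2s_channel_read call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bookkeeping for one contiguous capture buffer. */
typedef struct {
    uint8_t *data;    /* start of the contiguous buffer (PSRAM on target) */
    size_t capacity;  /* total size in bytes */
    size_t used;      /* bytes written so far */
} capture_t;

/* Append up to n bytes at the next writable position.
 * Returns the number of bytes actually stored, which is less than n
 * only when the buffer runs out of room. */
static size_t capture_append(capture_t *c, const void *chunk, size_t n)
{
    size_t room = c->capacity - c->used;
    if (n > room) {
        n = room;
    }
    memcpy(c->data + c->used, chunk, n);
    c->used += n;
    return n;
}
```

Because the write offset advances by the number of bytes actually received, chunks of different sizes still end up back to back, and c->used is the exact length to send over the socket afterwards.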

Good luck!

@L-KAYA
Collaborator

L-KAYA commented Apr 3, 2023

Hi @jayavanth, thanks for providing the code! I've looked into it and the issue you described. Here are some suggestions that may help you solve the issue:

How The Issue Occurs

I found that you allocated a static int16_t buffer at line 92:

static int16_t i2s_readraw_buff[SAMPLE_SIZE];

whose size in bytes is SAMPLE_SIZE * sizeof(int16_t) = SAMPLE_SIZE * 2.

But when reading the data at line 122, you only read SAMPLE_SIZE bytes, which means half of the buffer stays empty.

Then you copy the buffer to PSRAM, but you use the buffer index instead of memcpy. The step of the buffer index is equal to the size of int16_t, so you copy the whole buffer into PSRAM, including the empty half, and that is where the stutter comes from.

In summary, the third parameter of i2s_channel_read (size) is the number of BYTES to read, not the LENGTH (element count) of the destination buffer, and the fourth parameter (bytes_read) reports how many bytes were actually read. These two meanings are easy to mix up.
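The bytes-versus-elements mix-up can be reproduced without any hardware. The sketch below is only an illustration: fake_read is a hypothetical stand-in for i2s_channel_read that fills exactly the requested number of bytes, and SAMPLE_SIZE is shrunk to 8 elements.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAMPLE_SIZE 8  /* element count, kept tiny for illustration */

/* Hypothetical stand-in for i2s_channel_read(): writes `size` BYTES of
 * non-zero data into dest and reports the count through bytes_read. */
static void fake_read(void *dest, size_t size, size_t *bytes_read)
{
    memset(dest, 0x11, size);
    *bytes_read = size;
}

/* Returns how many int16_t elements of a SAMPLE_SIZE-element buffer
 * hold data after requesting `request_bytes` bytes. */
static size_t filled_samples(size_t request_bytes)
{
    int16_t buf[SAMPLE_SIZE] = {0};
    size_t bytes_read = 0;
    fake_read(buf, request_bytes, &bytes_read);

    size_t filled = 0;
    for (size_t i = 0; i < SAMPLE_SIZE; i++) {
        if (buf[i] != 0) {
            filled++;
        }
    }
    return filled;
}
```

Requesting SAMPLE_SIZE bytes fills only half of the elements, while requesting SAMPLE_SIZE * sizeof(int16_t) bytes fills all of them. Copying the buffer element by element in the first case carries the empty half along.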

How to Solve

Solution 1

As the issue is caused by the wrong buffer length, you can fix it easily by passing the whole size of i2s_readraw_buff at line 122:

if (i2s_channel_read(rx_handle, (char *)i2s_readraw_buff, SAMPLE_SIZE * 2, &bytes_read, 100) == ESP_OK) {

Solution 2

Actually, you don't even need the temporary buffer i2s_readraw_buff. You can read the data directly into PSRAM, using one contiguous array instead of a 2D array:

int16_t* psram_array = (int16_t*)heap_caps_malloc(ROWS * COLS * sizeof(int16_t), MALLOC_CAP_8BIT | MALLOC_CAP_SPIRAM);

if (i2s_channel_read(rx_handle, psram_array, ROWS * COLS * sizeof(int16_t), &bytes_read, 20 * 1000) == ESP_OK) {
    // on read success
}
else {
    // on read fail
}

Tips

Normally, you only need to consider whether the CPU is too busy to read all the I2S data if data loss has happened. Here, though, the stuttering audio still contains the complete data, which rules out data loss.

If data loss does happen, you can refer to the "How to prevent data lost" section in the programming guide.

Many thanks to @cdluv for the great effort contributed here! Hope everything goes well for you all!

@jayavanth
Author

Thank you both. Ya I think that's definitely wrong. I think the .wav format is also wrong. Let me do all the experiments and I'll report back soon

@cdluv

cdluv commented Apr 3, 2023

Great catch and nice work @L-KAYA ! A really clean and on-point answer. 👍 Good luck @jayavanth!

@jayavanth
Author

An update:

  1. I tried writing directly to PSRAM and fixed my indexing, but I still see the same stutter, although its period and amount seem to have reduced a bit. I verified that there are no empty spots during the buffer read by using memset() to initialise the PSRAM to a value other than 0 (for example, 2). When I then open the wav file in a hex editor, I don't see that value (2) repeated multiple times

  2. If I do i2s_channel_read in a single shot without a loop, PSRAM is incredibly slow. For example, 15s of recording took 1 minute. Of course that can be lowered by decreasing the timeout, but the more I decrease it, the more distortion I get.

  3. I tried setting the DMA values dma_desc_num and dma_frame_num to a variety of values but I don't see a difference in the stutter. The sound quality got a bit worse, though

  4. I am sampling at 16KHz now but still get the same stutter

  5. Weirdly, when I sample at 8KHz and 16-bit I get the stutter, but when I sample at 16KHz and 8-bit I don't. But of course, then there is the distortion in the latter.

I have updated my code here. Sorry for the commented lines and raw integer values for indexing: https://gist.github.com/jayavanth/7498ef6e75eed4eec975704c5d0dd292

So I have pretty much given up on PSRAM. I also found some posts to suggest that I2S and PSRAM don't work well together. But I'm not sure if those are still relevant and maybe those issues are fixed.

My plan right now is to do the following:

  1. I have a breadboard setup with a devkit with a different I2S mic. I have this setup working well so I'll check and see if I get the same stutter if I write to PSRAM
  2. I'll see if I can write to SPI flash instead. Hopefully this one works

Thank you both for the suggestions so far. I'll let you know how the above goes

@cdluv

cdluv commented Apr 4, 2023

This is very odd, but I can see that the stutter will always occur between the times you have received a block of sound samples, and copying the recorded data to SPIRAM. The measures I suggested earlier should have had a positive effect.

Here's a bullet-proof approach to prevent stutter:

  1. Create 3 DMA buffers in conventional RAM, let's say 3 x 4K buffers (see link in earlier post of mine)
  2. Capture I2S data into your RAM buffer (note that you should only be working with ONE of the 4K buffers at any point in time)
  3. Copy the RAM buffer by appending it into SPIRAM 2D array.
  4. Loop back to 2, if the buffer is not yet full.
  5. Send SPIRAM block to the network.

Using the DMA buffer with conventional RAM means you'll have sufficient time to copy the sampled audio, without stuttering.

Pretty sure that will work because the DMA controller will create an interrupt when the DMA buffer fills up, but also, audio capture will continue in the background. Note that a smaller buffer will create more interrupts, so having about 4K for a buffer will grant you more time to process the data which has arrived.
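To put a number on that "more time", here is a rough sketch of the arithmetic, assuming the relationships described in the ESP-IDF I2S guide: one DMA buffer holds dma_frame_num frames, and there are dma_desc_num such buffers. The helper names are hypothetical and the values in the usage note are illustrative only.

```c
#include <stdint.h>

/* Bytes in one DMA buffer: frames * slots per frame * bytes per slot. */
static uint32_t dma_buf_bytes(uint32_t frame_num, uint32_t slot_num,
                              uint32_t slot_bits)
{
    return frame_num * slot_num * (slot_bits / 8);
}

/* Microseconds of audio that all DMA buffers hold together, i.e. roughly
 * how long the reader can stall before the oldest data is overwritten. */
static uint32_t dma_total_us(uint32_t desc_num, uint32_t frame_num,
                             uint32_t sample_rate_hz)
{
    return (uint32_t)(((uint64_t)desc_num * frame_num * 1000000u) / sample_rate_hz);
}
```

For example, with 6 descriptors of 240 frames at 16 kHz, mono, 16-bit, each buffer is 480 bytes and the buffers together cover 90 ms of audio.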

Once again, good luck! If anything, we're learning loads.

@jayavanth
Author

jayavanth commented Apr 5, 2023

The SPI Flash didn't work either. It seems much worse. I don't see any stuttering, but the whole thing sounds robotic. Code here: https://gist.github.com/jayavanth/6226bc7b40f5b7c60cdb5837cc224b90. I was in disbelief when I heard the results from this. Really thought it would do better given the performance of SPI SD card readers

Ok I'll take your advice and re-read and verify everything that you mentioned so far. Especially the last paragraph. Seems like PSRAM would give me the best chance here

@L-KAYA
Collaborator

L-KAYA commented Apr 6, 2023

If you are worried that the data recorded from I2S is not continuous, a tip is to register an on_recv_q_ovf callback with i2s_channel_register_event_callback. It helps you monitor whether any data is dropped due to a queue overflow (which is normally caused by polling too slowly while reading I2S).

However, I still suspect that the data sent via WiFi might be incomplete. Could you print all the data to the terminal to check whether it looks correct before sending it? You can also move the print to each step (for example, before writing into flash) to trace where the data goes wrong.

@jayavanth
Author

You were right. I registered the callback and nothing was dropped. It's probably the wifi. Will investigate that

@jayavanth
Author

All right. Found the bug! It was because I had set my socket to non-blocking! Will post the working code in a bit. Phew!
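The thread does not say exactly how the non-blocking socket corrupted the stream. One common failure mode is that send() accepts only part of the buffer, or fails with EAGAIN, and the remainder is silently dropped. The sketch below is a hypothetical helper (send_all is not from the gist) that keeps sending until every byte is out. It assumes POSIX send() and poll() are available, which is true on a desktop host and is assumed here for the ESP-IDF socket layer as well.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send exactly len bytes on fd.
 * Returns 0 on success, -1 on an unrecoverable error.
 * Works for both blocking and non-blocking sockets. */
static int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n > 0) {
            /* Partial send: advance past what was accepted. */
            p += n;
            len -= (size_t)n;
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Socket buffer is full: wait until it is writable again. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            if (poll(&pfd, 1, -1) < 0 && errno != EINTR) {
                return -1;
            }
        } else if (n < 0 && errno == EINTR) {
            continue;  /* interrupted by a signal, retry */
        } else {
            return -1; /* real error, or send() made no progress */
        }
    }
    return 0;
}
```

Leaving the socket in blocking mode, as the author ended up doing, avoids the EAGAIN case, but checking the return value of send() for short writes is still worthwhile.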

@cdluv

cdluv commented Apr 7, 2023 via email

@jayavanth
Author

Oh thank you! I thought they were the same

@jayavanth
Author

Here's the final code for mono recording. Cached on PSRAM and then sent via TCP socket. Working on getting stereo working..

https://gist.github.com/jayavanth/98873070da289e2a4df2971563061665

Thank you @cdluv and @L-KAYA. Could not have done it without your help 🙏

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: Opened Issue is new labels Jun 2, 2023