New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[q-mr1] [AUDIO] cirrus_sony: Calibration fix for mono calibration #22
Conversation
Damn sure I had a rebase-branch for these on Q at some point but forgot to submit it, thanks! Looks like something went wrong during conflict resolution, 64f5f62 moves a function down presumably because the new functions were first added below and later moved above. |
@MarijnS95 argh, sorry i am not sure why this happens, i actually changed the function cirrus_do_fw_mono_download to use the cirrus_write_cal_status and cirrus_write_cal_checksum. but i had to move the two functions up before the cirrus_do_fw_mono_download function. so the diff looks a bit strange, but it does what it's supposed to do :/ |
Maybe it might make sense to move the functions in a seperate commit and do the fix in another? |
af0763a
to
b263594
Compare
Ok now it should be more clear what i've actually done in 1b4c272 :) |
Hmm it seems that even with those changes calibration fails in some cases and i am not sure why. 03-11 11:58:52.017 4452 4895 D audio_hw_primary: check_usecases_codec_backend:becf: check_usecases num.of Usecases to switch 0 Any clues? Sometimes the /data/vendor/audio/cirrus_sony.cal file appears, sometimes not. Some more logs: |
@krnlyng It is clear what happened: Angelo introduced those functions above Perhaps it is better to amend that commit and put those functions in the right place directly. Note that we are really not opposed to force-pushing branches where it improves workflow and readability so we might as well reapply those commits with proper ordering on 9.12 as well. Is PA trying to play any audio while the calibration hasn't fully finished? We faced that while building and debugging the support for Edo devices - if anything is played, even during the 6 seconds of waiting for the firmware to complete its thing, it gets stuck in some unrecoverable error state. I don't exactly remember what error that is though. |
@krnlyng please fix it so that I can merge the PR |
b263594
to
3fa3c0b
Compare
Ah yes we tried delaying the playback for a little bit in the calibration case but it wasn't enough (iirc we tried 2-3 seconds, but that is now obvious that this is too short when the calibration is running, 6 seconds is a lot :O). Will try with a bigger delay. |
Last time I was debugging the calibration issues happening when a playback race happened (rather rare at Android first boot) with @MarijnS95 ... but anyway, I've extensively tested the calibration commits and they're perfectly working on my Edo device (dual CS345L41). The sleep(6) was removed in the commit named "cirrus_sony: Loop around DSP state after calibration" - probably something went wrong during the cleanup phase, because effectively that commit doesn't contain the removal of that call, which becomes useless with the waitloop around calibration, as that was meant to be an optimization of that stupid blindly-sleep-until-done thing. Regarding your concern about timing issues, yes - it looks like, as Marijn said, when the audio pipe gets opened issues come around. This is not a huge issue on Android, as the startup is mostly controlled. Finally, this PR should be valid - at least, for Android - as I've been using this code for a month or more. |
Great, thanks! I'll give this a test on Android later and probably provide a branch to reset R to, to share the same clean history. If I may have one more picky question: how about moving the changes that call the new functions right after Delaying playback is a bit tough because calibration uses the same usecase and playback device, so we can't easily block there (well, we know what thread the call comes from, and iirc did some magic tricks). The plain |
We did it like that initially because we were in a big hurry, a screenshot from our brains at the time: brain@Marijn-Angelo$ ./develop_edo
Error: -ENOTIME
brain@Marijn-Angelo$ ./develop_edo --force
Error: Working too hard, try again later But yes you remember right, the audio hal is threaded and the audioserver is threaded, so we got a spkr_prot audio-start callback during "paths discovery" and we would get state changes, but then we sort of solved that with 2abf1a4 ... though there still was some very very rare corner case to be addressed. |
I think we ran out of thread ids, process ids, file descriptors and memory at the same time. Still recovering from the crash. Correct, @krnlyng Perhaps you can litter some logs around the HAL and understand whether audio playback - or anything else poking the mixers - manages to sneak through while a calibration is still ongoing. (@kholk Btw I just found |
Yup, exactly.
It's a bit of a mess, yes, but... at the end of the day, if this is the only solution and the only way to make it work on SFOS and whatever else, probably at this point we should just stop being picky, take the defeat and just do whatever possible workaround on top of another workaround on top of ... :) I mean, in the end, what counts is that stuff works. Not less, not more.
Since he is analyzing an usecase involving pulseaudio and a "strange" connection to the audio hal, that may be really useful.
OMG no, why are you doing this to me :\ |
@MarijnS95 i can move the commit towards the other one, however i am not sure what you mean by splitting it? which parts should be split into a separate commit? or do you just mean shortening the commit message? |
@kholk not saying that we shouldn't fix this, at all, because we really should. There was however no incentive to fix it back then when we were lost for time (we still are?) but this issue hitting SFOS is the right time to restart some debugging efforts.
Correct. @krnlyng To be more specific - you probably already assumed this - place the logs in
I think I was unsure about why
Split it up in one part that calls the new functions, directly after 11f3eef |
@krnlyng please update the PR |
3fa3c0b
to
948628c
Compare
code changes: https://pastebin.com/EneivSTk (this is a log from a failed calibration - i hope that's what you meant) |
Ok we now have a solution for this on our side. Basically if the calibration has not been done yet (e.g. the file doesn't exist) we wait 10 seconds after opening the audio HAL before doing something. And if the calibration was already done (file exists) we do not wait and use the HAL immediately, this should be good enough since the calibration will (hopefully) only be done on the first boot. Together with this PR this seems to work. But two questions: |
That's pretty horrible; we should just prevent audio from being played in the HAL, until the device is absolutely ready. I don't fully remember the entire chunk of code but there is no guarantee that the Amp/DSP is up and ready after we've cached the calibration data.
Yes. Silence is for it to detect noise on the line (apparently this is going through an analog signal to the Amp/DSP combo, where it is converted to digital again for "DSP stuff" and finally analog and amplified before it reaches the speaker...), after that the DSP does the whoop all by itself. |
That's probably the right wait time for a horrible temporary workaround, I don't think you'd need more than that, but it's possible that you get the thing calibrated in less than 6 seconds.
As Marijn said, yes. |
@krnlyng Would you be able to debug this further from a HAL perspective? Afaik the PCM device should not be opened by the HAL if |
Looking at those logs it seems like it happily continues after
Anyway, seeing something weird on the calibration thread (
I hadn't expected to see |
@krnlyng I'm picking this back to 9.12 to fix the mono ctls and have the timeout removed, but there are a ton of conflicts because 13b245e supposedly wasn't picked - meaning you had the same conflicts the other way around... EDIT: Yeah, it even seems like Angelos commit that adds If I'm not mistaken these four are not on
(In newest-to-oldest order) Any reason why they have been unintentionally forgotten about or intentionally not picked? I'd appreciate if we can keep both branches as similar as possible. Attempts to resolve conflicts show that there is quite some divergence going on already. |
Hi, sorry i got a bit distracted by other things. We tried investigating this a bit from HAL perspective whether we can come up with a better fix than the delays, but unfortunately time didn't agree. |
@krnlyng Thanks! I have also backported your patches into the existing 9.12 tree, perhaps it's just as easy to adopt the entire tree here: https://github.com/MarijnS95/vendor-qcom-opensource-audio-hal-primary-hal/commits/9.12-cirrus-patches starting from the initial missed patch, and compare that to what you have here. Those should, theoretically, pick to 8.1 without any conflicts. |
Hey, sorry to ask again but before i push something/broken/useless: It seems 1ee5303 is missing from https://github.com/MarijnS95/vendor-qcom-opensource-audio-hal-primary-hal/commits/9.12-cirrus-patches and that causes conflicts when picking. I can of course fix those but is that the right approach here? maybe 1ee5303 should also be on https://github.com/MarijnS95/vendor-qcom-opensource-audio-hal-primary-hal/commits/9.12-cirrus-patches ? |
Yes that patch is removed intentionally from my local branches because it is a really dirty way to cover up for https://github.com/sonyxperiadev/kernel/pull/2345/files. DAPM should turn on and off the hardware when it is (not) used but it doesn't always seem to come out of hibernate, and 1ee5303 has proven locally to not resolve that problem at all. Yet everyone keeps insisting to have that patch :( Feel free to test it out, pick/backport it, I have no stake in this nor enough time in any way shape or form to investigate what is going on. |
This nice helper is around to simplify repetitive bits of code; put it to good use. Signed-off-by: Marijn Suijten <marijns95@gmail.com>
It is not necessary to pass pointers to strings, because the pointer itself is never modified to point somewhere else (ie. to a different or new string allocation). Signed-off-by: Marijn Suijten <marijns95@gmail.com>
CIRRUS_CTL_NAME_BUF is only defined to contain 40 bytes which is well within our budget for the stack. It is pretty much free (besides a stack-pointer increment/decrement) in contrast to a costly heap allocation and free. Signed-off-by: Marijn Suijten <marijns95@gmail.com>
Initializing ret to -EINVAL, but never updating it when everything is sucessful results in a misleading: audio_hw_sony_cirrus_playback: cirrus_do_calibration: Cannot save calibration to file (-22)!!! Signed-off-by: Marijn Suijten <marijns95@gmail.com>
These functions are not used; it's rather wasteful readability-wise to retrieve them. (The others are not used yet... but will be in future commits) Signed-off-by: Marijn Suijten <marijns95@gmail.com>
Edo has the cirrus AMPs on a different backend (SEN_MI2S_RX to be exact, instead of PRI on seine). The mixer listed here is to enable playback through the deep-buffer usecase, so we can simply enable that path to get the right backend enabled regardless of the device based on its mixer_paths. At the same time "clarify" hardcoded PCM id `0` by reading it from the configuration (potentially overridden by audio_platform_info.xml) instead. Signed-off-by: Marijn Suijten <marijns95@gmail.com>
948628c
to
14d629e
Compare
Not worth mentioning (in the commit, in the PR is fine) that there was a conflict due to some added lines.
What do you mean with this? |
Fixed compilation due to device_list not being avaliable in q-mr1. Enable silent audio playback (for calibration) through more verbose but more specific APIs instead of manually poking around with the paths. Signed-off-by: Marijn Suijten <marijns95@gmail.com>
This pointer strangely has a value immediately on Edo, but not on Seine.
cirrus_format_mixer_name returns the number of formatted characters (result of snprintf). Returning this positive number is not always a good idea, like when check_error_state returns 22 as the length of a mixer control, which is caught by the `==0` check resulting in a full reset. Secondly, a postive length of 22 together with an actual, expected failure of -EINVAL which conveniently is -22 results in 0; no error at all!! Signed-off-by: Marijn Suijten <marijns95@gmail.com>
14d629e
to
689942b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@krnlyng As mentioned before it is not worth mentioning "Resolved conflicts due to extra 1ee5303." nor "Cherry-picked from aosp/LA.UM.9.12.r1 and conflicts resolved.", in particular when we want to reapply this tree (mono calibration) back to 9.12. If you can remove that this seems ready to be merged from my side.
Leaving the comment in the code about list_init
is pro-ba-bly fine, we'll have to fix that up anyway when reapplying on 9.12.
There is no need (and it's wrong) to call pcm_start before writing the data, as we can simply retrieve, prepare and write the silence data to the PCM instead. Also, while at it, add some locking around to make sure that we don't race while setting module states, especially in corner cases where we need to calibrate the speakers before starting processing (very, very unlikely case that should anyway never happen). Co-authored-by: Marijn Suijten <marijn.suijten@somainline.org>
The new firmware features more creativity, which means that the mixer names have changed. Also, for some reason, when we've got a dual amp setup, the DSP refuses to load the R-SPK firmware by caprice if we load the L-SPK one before that, even though they definitely have no real correlation between each other (or at least not firmware-wise). Even though there are *way* better ways to do this, time does not allow me to, so this is a very fast workaround to get the audio calibrations kicking.
On devices with multiple CS35L41 amps the calibration process is a bit slower for.. reasons. First of all, the DSP likes to just take time to get back alive after the calibration run succeeding (reasons unknown: firmware is closed) and there is no way to know when it precisely comes back to life. Then, this task being now very much time consuming brings more issues to the table, like the audio HAL trying to playback sounds from the OS before the calibration run is done: this may actually interact with the calibration silence and/or confuse the DSP and make it crash, which is why we now need to carefully set and check the module state, meaning that we are obliged to check and forbid audio playback if this software is waiting for DSP calibration. The result of this is that *at least* Sony Edo, featuring this kind of dual-amp cofiguration, can now calibrate and switch to protection FW without requiring a system reboot, even though it's going to be a bit slow: this was ignored for the sake of safety and audio quality through the internal speakers and it is very important to recalibrate this when we have the opportunity to, as the TA-stored calibration cannot be accounting for hardware aging while, with this, WE CAN.
Instead of waiting for a arbitrary amount of time in any case, a better user experience may be (in some cases) achieved by looping through the DSP state: if we succeed to change the DSP firmware mixer (to load the new FW) before the max wait, that's going to be... well... less time to lose. This actually brings benefits depending on the specific device and on the specific boot: for example, my PDX203 sometimes takes as much as 5 seconds, sometimes just two, and this probably depends on the boot order since it's entirely async, which gets different depending on the device state before actually booting, as the hardware will be in different states and some daemons init time will be skewed; For example: - Bootloader version X will leave some HW in a different state than version Y; or - The device is booting from a fastboot continue command; or - The device is cold booting.
We use fp_enable_snd_device (and friends) to set up the right routes/paths and backends before trying to use the speakers, with the annoying caveat that start_processing is called. Usually after a reboot this only spews `Forbidden. Calibration is in progess...` once, but if AOSP already tries playing audio it'll get stuck in a loop with lots of pops and clicks. Signed-off-by: Marijn Suijten <marijns95@gmail.com>
Instead extend the retry loop to include the 6 seconds in which previously nothing was done.
Ok, thanks for the clarification, i hope it's alright now, i've reworded the two commits and dropped those 2 lines. |
On SailfishOS we observe that the calibration of the cirrus device fails, noticeable by a frequency sweep noise during bootup.
Fix this by cherry-picking the related fixed from the aosp/LA.UM.9.12.r1 branch. And also fix the calibration for mono.
Part 1: Cherry-pick #20 and #21 onto q-mr1 and fix conflicts.
Part 2:
64f5f62: Seems in mono calibration the different firmware names need to be accounted for too. Probably good to cherry-pick this onto aosp/LA.UM.9.12.r1 too.
af0763a: Initially i thought this was my issue but it turned out it wasn't, but i guess it's still a nice addition. Can drop if wanted.