implement microphone input & pitch detection #12
Comments
regarding pitch detection:
|
I find aubio's YIN implementation good for pitch detection. C library. |
The Unity API seems too slow for real-time microphone input (i.e. around 33 ms per frame at 30 FPS). The bottleneck seems to be […]. Thus, I think we need an external library (probably C/C++) to capture microphone input fast enough. |
This is wrong. Unity API of the mic is not the bottleneck. I was using the API wrong. Anyway, as daniel-j pointed out, aubio implements some nice algorithms. I found a C# binding of its API called Aubio.NET. Yay! |
As far as I understand copyleft software licensing, you may not use any GPLv3-licensed code or library in anything that is not itself GPLv3-licensed, not even when just dynamically linking to it at runtime. Thus, to use Aubio, this project would have to change its license to GPLv3. Personally, I do not think that this is necessary, and we should be able to (for example) implement CAMDF ourselves, or something similar such as an autocorrelation-based algorithm. There are plenty of public-domain algorithms that do pitch detection from audio samples just fine. Regarding the GPLv3 only allowing use in GPLv3-licensed software when publishing it, see https://softwareengineering.stackexchange.com/questions/204410/how-are-gpl-compatible-licenses-like-mit-usable-in-gpl-programs-without-being-su |
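To give a feeling for how small an autocorrelation-based detector can be, here is a minimal sketch in plain Python. It is not code from this project or from USDX. The function name, the 80 to 1000 Hz search range and the "first peak within 90 % of the best score" rule against octave errors are all assumptions made for illustration.

```python
import math


def detect_pitch_autocorrelation(samples, sample_rate, min_hz=80.0, max_hz=1000.0):
    """Estimate the fundamental frequency in Hz, or return None for silence.

    Hypothetical helper: the frequency range and the 0.9 threshold are
    illustrative choices, not values taken from any existing implementation.
    """
    n = len(samples)
    if n == 0:
        return None
    min_lag = max(1, int(sample_rate / max_hz))
    max_lag = min(n - 2, int(sample_rate / min_hz))
    mean_energy = sum(s * s for s in samples) / n
    if mean_energy < 1e-8 or min_lag >= max_lag:
        return None

    # Correlate the window with a shifted copy of itself for every candidate lag.
    scores = {}
    for lag in range(min_lag, max_lag + 1):
        overlap = n - lag
        corr = sum(samples[i] * samples[i + lag] for i in range(overlap))
        scores[lag] = corr / overlap  # normalise by the number of summed terms

    best = max(scores.values())
    if best <= 0.0:
        return None

    # Take the smallest lag that comes close to the best score, then climb to
    # its local peak. This prefers the true period over its multiples.
    threshold = 0.9 * best
    lag = next(l for l in range(min_lag, max_lag + 1) if scores[l] >= threshold)
    while lag < max_lag and scores[lag + 1] > scores[lag]:
        lag += 1
    return sample_rate / lag


# Usage: a synthetic 441 Hz sine; the estimate should land close to 441 Hz.
sample_rate = 44100
tone = [math.sin(2.0 * math.pi * 441.0 * i / sample_rate) for i in range(1024)]
estimate = detect_pitch_autocorrelation(tone, sample_rate)
```

This brute-force version costs one multiplication per sample per lag. That is fine for a sketch, but a real-time version would probably compute the correlation with an FFT.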
@achimmihca I did some more research on this topic, and it turns out there is no legal way to integrate Aubio into UltraStar Play and then continue using the Unity framework. We could try contacting the authors of aubio and aubio.net and ask them for an MIT-licensed version. |
Uhhh, what a bummer! I will do some refactoring so that the pitch detection algorithm can be swapped more easily. However, I am not that much into signal processing. I have no idea how complex YIN or autocorrelation actually are. Maybe it is not that difficult to implement? I will leave this task to someone else ;) |
Long-time USDX fan here. Google sent me here as I was wondering how USDX did pitch detection. I'm writing a tool to automate the conversion of mp3s to UltraStar files. I have tried enhanced autocorrelation (EAC) by Tolonen et al. (A Computationally Efficient Multipitch Analysis Model, Tero Tolonen and Matti Karjalainen) in both Python and C, and found it to be very fast and reliable for test vocal tracks. Here is a simple Python implementation: https://gist.github.com/anjiro/e148efe17c1e994981638b1a0c6d0954 And one based more closely on Audacity's implementation: It works fine with a sliding window size N of 512 samples, and the most expensive term in the calculation is just O(2·N·log(N)), as it needs only two N-sized FFTs per window. If you wish to take a quick look at the output of the algorithm, load a vocals track in Audacity, select the spectrogram view and choose the algorithm Pitch (EAC) in the spectrogram settings dialog box. A sample of the EAC of the vocals track from a famous pop hit song: |
@psarkozy current USDX uses the CAMDF algorithm (which is public domain licensed). It was implemented as part of this pull request: UltraStar-Deluxe/USDX#461 |
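For readers unfamiliar with CAMDF (circular average magnitude difference function), a minimal sketch follows. It is not the code from the USDX pull request. The function names, the MIDI note range 36 to 84 and the silence threshold are assumptions chosen for illustration. The idea is to compare the window with a circularly shifted copy of itself and pick the candidate note whose period gives the smallest total difference.

```python
import math


def camdf(samples, lag):
    """Sum of absolute differences between the window and itself shifted by `lag`,
    wrapping around at the end of the window (the "circular" part)."""
    n = len(samples)
    return sum(abs(samples[(i + lag) % n] - samples[i]) for i in range(n))


def detect_midi_note_camdf(samples, sample_rate, low_note=36, high_note=84):
    """Return the MIDI note whose period minimises the CAMDF, or None for silence.

    Hypothetical helper: the note range and the 1e-4 silence threshold are
    illustrative assumptions.
    """
    if not samples or max(abs(s) for s in samples) < 1e-4:
        return None
    best_note, best_value = None, math.inf
    for note in range(low_note, high_note + 1):
        frequency = 440.0 * 2.0 ** ((note - 69) / 12.0)  # equal temperament, A4 = 440 Hz
        lag = round(sample_rate / frequency)  # period of this note in samples
        if lag < 1 or lag >= len(samples):
            continue
        value = camdf(samples, lag)
        if value < best_value:
            best_note, best_value = note, value
    return best_note


# Usage: a synthetic tone slightly above A4 (MIDI note 69).
sample_rate = 44100
tone = [math.sin(2.0 * math.pi * 441.0 * i / sample_rate) for i in range(1024)]
note = detect_midi_note_camdf(tone, sample_rate)
```

Testing only one lag per semitone keeps the cost at a few dozen passes over the window, which suits a game that only needs the sung note and not an exact frequency.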
Done. To change the pitch detection algorithm, one has to provide a different implementation of IAudioSamplesAnalyzer in the MicrophonePitchTracker. |
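The idea of a swappable analyzer can be sketched as follows. This is Python, not the project's C#, and only the names IAudioSamplesAnalyzer and MicrophonePitchTracker come from the comment above. The method names `find_pitch` and `on_samples` and the two toy analyzers are hypothetical.

```python
import math
from typing import Optional, Protocol, Sequence


class AudioSamplesAnalyzer(Protocol):
    """Stand-in for IAudioSamplesAnalyzer; the method name is an assumption."""

    def find_pitch(self, samples: Sequence[float], sample_rate: int) -> Optional[float]:
        ...


class ZeroCrossingAnalyzer:
    """Deliberately naive analyzer: counts rising zero crossings per second."""

    def find_pitch(self, samples, sample_rate):
        rising = sum(1 for a, b in zip(samples, samples[1:]) if a < 0.0 <= b)
        if rising == 0:
            return None
        return rising * sample_rate / len(samples)


class FixedPitchAnalyzer:
    """Test double that always reports the same pitch."""

    def __init__(self, pitch_hz):
        self.pitch_hz = pitch_hz

    def find_pitch(self, samples, sample_rate):
        return self.pitch_hz


class MicrophonePitchTracker:
    """Forwards incoming samples to whichever analyzer it was given."""

    def __init__(self, analyzer: AudioSamplesAnalyzer, sample_rate: int):
        self.analyzer = analyzer
        self.sample_rate = sample_rate

    def on_samples(self, samples):
        return self.analyzer.find_pitch(samples, self.sample_rate)


# Usage: the tracker code stays the same, only the injected analyzer changes.
sample_rate = 44100
tone = [math.sin(2.0 * math.pi * 441.0 * i / sample_rate) for i in range(4410)]
naive_result = MicrophonePitchTracker(ZeroCrossingAnalyzer(), sample_rate).on_samples(tone)
fixed_result = MicrophonePitchTracker(FixedPitchAnalyzer(220.0), sample_rate).on_samples(tone)
```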
With that pull request merged, I did some comparison tests. A friend sang a song in UltraStar Deluxe, scoring above 9500 points on difficulty medium, and recorded it with Audacity. Then I used the VB-CABLE driver pack on Windows to connect a virtual audio output device directly to a virtual audio input device acting as the karaoke microphone. I swapped the song.ogg file, corrected the GAP in the song.txt and then had UltraStar Deluxe and UltraStar Play play these songs, so both games graded the same audio file against the same UltraStar txt song file. Tests with other songs showed mostly similar results. Still, the pitch evaluation and grading should be improved. Increasing the audio sample window size to roughly (one third of?) the length of one beat of the UltraStar song txt file (and then averaging the pitches of the three thirds) will probably filter out noise better and thus improve pitch detection. |
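The window-size suggestion above can be sketched like this. Both function names are hypothetical, the beat duration is passed in directly instead of being derived from the song file, and `detect` stands for any pitch detection function that returns a number or None.

```python
def samples_per_third_of_beat(sample_rate, beat_duration_s):
    """Window length in samples when one beat is split into three windows."""
    return max(1, int(sample_rate * beat_duration_s / 3))


def average_pitch_over_beat(beat_samples, detect):
    """Split one beat into three windows, detect a pitch in each and average
    the windows where a pitch was found. Returns None if none was found."""
    third = len(beat_samples) // 3
    if third == 0:
        return None
    pitches = []
    for k in range(3):
        window = beat_samples[k * third:(k + 1) * third]
        pitch = detect(window)
        if pitch is not None:
            pitches.append(pitch)
    if not pitches:
        return None
    return sum(pitches) / len(pitches)


# Usage with toy detectors so that the arithmetic is easy to follow.
window_size = samples_per_third_of_beat(44100, 0.25)
mean_of_first_samples = average_pitch_over_beat(list(range(9)), lambda w: float(w[0]))
ignoring_gaps = average_pitch_over_beat(
    list(range(9)), lambda w: None if w[0] == 0 else 60.0
)
```

Skipping windows without a detected pitch means a short dropout inside a beat does not pull the average down. Whether to average in Hz or in semitones is left open here.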
First, I must say this is a very smart comparison! Second, with the better pitch detection I noticed that the PlayerNoteRecorder is very buggy. It records notes where none should be. Furthermore, notes sometimes disappear and pop up at a different position. I will have to look into this again. Scoring is also still buggy. The PerfectSinger script, which simulates singing the expected notes of the song, receives a total score near 10000, but not exactly 10000. This could be a rounding issue or floating-point inaccuracy in the current approach. It would be better to calculate the player score with integers and a different formula. I will have to look into this again. |
@basisbit have you redone the comparison with USDX recording and playback in Play? I would be interested in how the results changed with the latest changes. Anyway, I suggest closing this ticket because basic mic input and pitch detection are working so far.
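One way to make a perfect performance yield exactly the maximum is to count beats as integers and scale only once at the end. The formula below is an assumption for illustration, not the project's scoring rule, and it ignores note types such as golden notes.

```python
MAX_SCORE = 10000


def integer_score(correct_beats, total_beats, max_score=MAX_SCORE):
    """Score from integer beat counts; a perfect run gives exactly max_score.

    Hypothetical formula: multiply first, then floor-divide, so no
    floating-point rounding can accumulate.
    """
    if total_beats <= 0:
        return 0
    if not 0 <= correct_beats <= total_beats:
        raise ValueError("correct_beats must be between 0 and total_beats")
    return max_score * correct_beats // total_beats


# Usage: a simulated perfect singer and a half-correct performance.
perfect = integer_score(300, 300)
half = integer_score(150, 300)
```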
I just tested this. Unity detects the changes. For example, in the recording options scene the label "Hardware not connected" goes away when plugging in the mic and selecting the mic again. The Unity API does not seem to have an event that we can hook into to detect the changes. Still, I think it is sufficient because there are no crashes when plugging in a new mic and the user will most likely try a manual refresh of the recording device. When removing a connected mic, the MicData buffer is not receiving new samples from the mic. Thus, in the sing scene for example, the last sung note will be repeated until the end of the song. Again, I think this is sufficient because there are no crashes and the behaviour is somewhat reasonable.
Is this also related to mics or do you mean other input / output devices? It feels like the same question as above. |
We are currently at ~ 6400 points with UltraStar Play at difficulty easy compared to ~ 9530 with UltraStar Deluxe and difficulty medium. |
@basisbit Please redo your benchmark. I am curious how it performs with the latest changes. Anyway, I suggest closing this issue now, as its main feature has been implemented (mic input and pitch detection). |
Have you guys had a look at Vocaluxe? The developers of Vocaluxe first worked on UltraStar Deluxe until they reached a point where they wanted to start over fresh and make a much better game engine. So about 10 years ago they left Deluxe to wither and started developing Vocaluxe instead. They put a lot of emphasis on note detection and latency, making it superior to the UltraStar clones. Perhaps you're past this stage, but since I heard of your project today I thought I'd just say this. And if we were ever to switch from Vocaluxe, the game engine would have to be as good as Vocaluxe's. And as we have 14,000+ entries in our highscore database, it would be nice if the scores were comparable to Vocaluxe's, because we're not starting over from scratch again :) |
@daggeg what highscore database? |
Ah, that's too bad. Did not know that. |
Thanks for the hint. I took a look at their code. I did not find out exactly what they do, as there are multiple pitch tracker implementations. However, this way I found dywapitchtrack. It is MIT-licensed and implements a wavelet algorithm that seems to be very fast and accurate. I migrated the code (just a few functions) to pure C# for USPlay and it works great so far, even without the tricks and the rounding hack that I implemented to make the CAMDF pitch detection feel more reliable. So I will create a new PR soon with this and an option to choose the pitch detection algorithm. Then everyone can evaluate it by singing some songs. |
Another important change that I have planned is about the way notes are recorded from incoming mic samples. Instead, it might be better to buffer the samples of some frames and analyze the samples that make up a beat, or even multiple beats of the same note. Buffering makes it possible to analyze with some surrounding sample context (in both directions). Furthermore, it should make the pitch detection less frame-rate dependent. |
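The buffering idea can be sketched as a small accumulator that collects whatever each frame delivers and hands out complete beat-sized windows. The class and method names are hypothetical, and the sketch assumes a fixed number of samples per beat.

```python
class BeatSampleBuffer:
    """Collects mic samples across frames and returns complete beats."""

    def __init__(self, samples_per_beat):
        if samples_per_beat <= 0:
            raise ValueError("samples_per_beat must be positive")
        self.samples_per_beat = samples_per_beat
        self._pending = []

    def push(self, frame_samples):
        """Add the samples of one frame; return a list of completed beats."""
        self._pending.extend(frame_samples)
        beats = []
        while len(self._pending) >= self.samples_per_beat:
            beats.append(self._pending[:self.samples_per_beat])
            del self._pending[:self.samples_per_beat]
        return beats

    def pending_count(self):
        """Number of samples waiting for the current beat to fill up."""
        return len(self._pending)


# Usage: frames of different sizes still produce identical beat windows.
buffer = BeatSampleBuffer(samples_per_beat=4)
first = buffer.push([1, 2, 3])            # not enough for a beat yet
second = buffer.push([4, 5, 6, 7, 8, 9])  # completes two beats, one sample left
```

Because the beat windows depend only on the sample count, a slow frame delivers more samples at once but does not change which samples are analyzed together.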
The new stuff works like a charm. Finally :) ! |
detect which pitch(es) were "sung" by the player(s) since the last time it was checked (polling from some pitch detection service per microphone).
support multiple channels per device -> Support multiple channels in recording device (e.g. left and right microphone) using PortAudio #85
auto-detect the system's total audio latency by playing 3 different pitches and checking for each mic how much latency there is until reception of these tones -> Audio latency detection #86