Model test results - model 20240117 #23
Awesome! Thanks for the feedback @madmaximus101 . A few questions and comments below.
Do you have listen_key set to a loud or clicky key? I have a theory that it’s picking up some noise as “Blue” before you start speaking. It might also be retaining some audio before you press the listen_key, which would be a problem I’d need to fix in code if it’s the case (though I thought I already fixed it!). It could also be the model but I want to narrow down possibilities. You shouldn’t need to speak as a TV presenter for it to work accurately, if you do there’s something wrong.
Is C4 active in your grammar module? It isn’t by default. If it is a valid command in your grammar module, but it’s not being recognised, please confirm / let me know.
My listen on/off key is set to my mouse thumb button; it is not really noisy. I have noticed, however, if I breathe or sigh, or if I'm typing away, it will recognise noises and attempt to decode them. If I don't want any listen padding at start and end, or any automatic voice on/off feature, which setting do I change? Edit: will try listen key toggle 2. Edit: I am using the experimental model as provided.
@madmaximus101 If listen_key_toggle is set to -1, it will always be listening for either YellFreeze or NoiseSink, so it’s fine for it to decode noises, as long as it doesn’t Yell without you saying “freeze” (or similar)… unfortunately it will likely yell at noises at least sometimes unless you have a very quiet environment and good mic. Just let me know if it’s truly unplayable and if it’s worse than the base model.
Listen key toggle 2 seems to be better. Hot mic always on seems to be a much better experience overall: no random ghost or added-on commands, much higher success rate in general. I did have some misheard commands, either due to not being clear enough or, I assume, too-quick speaking.

With some of the retained audio I've noticed I seem to have a tendency to breathe in or make an initial "opening mouth sound" as I click the hot mic button, or just after I did. May have to learn to not do that lol. I also noticed my mic volume was way up. That might be a contributing factor also, potential for minor distortion of sound to ruin things etc. Will lower mic volume lol. Edit: for reference my headset is the Sennheiser GSP 670. I'd say it's better than average quality for sure.

Here is a link to a gameplay session using listen key toggle 2, hot mic always on, experimental Kaldi model as provided.

I am attempting to run the test thing in PowerShell. I think I'm running the command correctly and nothing is happening? The command runs, but I get no results, output or files generated?
@madmaximus101 check in the retain.tsv, are the referenced file paths to the .wav files correct? e.g. ./cleanaudio_cmds/retain-123.wav
I downloaded the Tacspeak app & Kaldi model as is, haven't changed anything. If I forgot an instruction in regards to these needing file paths modified, I apologise. I noticed in your PowerShell window the '. after test_model were grey; in my PowerShell window the '. after test_model is blue. Thought I'd point that out just in case that means anything.
@madmaximus101 You’re running the command with ./cleanaudio_cmds/retain.tsv whereas it should probably be ./retain/retain.tsv. It doesn’t matter what the path is as long as:
- ./somedir/retain.tsv is a valid file and path
- the .wav file paths in the retain.tsv are valid files and paths.
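Those path checks can be automated with a short script. A minimal sketch, assuming retain.tsv is tab-separated with the .wav path in the first column (the real column layout in Tacspeak may differ):

```python
import csv
from pathlib import Path

def check_retain_tsv(tsv_path):
    """Check that the retain .tsv exists and that every referenced
    .wav path inside it points to a real file. Returns a list of
    missing .wav paths (empty list means everything is valid)."""
    tsv = Path(tsv_path)
    if not tsv.is_file():
        raise FileNotFoundError(f"{tsv} is not a valid file")
    missing = []
    with tsv.open(newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            if not row:
                continue
            wav = Path(row[0])  # assumption: the .wav path is column 0
            if not wav.is_file():
                missing.append(str(wav))
    return missing
```

Running it before `--test_model` would surface exactly the "nothing happens" failure mode described here: a .tsv that exists but points at .wav files that don't.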
Ahh, a simple filepath error. I copy-pasted the example command given without a second thought 😅. Appreciate the patience & help mate.
Ok, I am at a point where I feel I am ready to start doing the initial collecting of data in somewhat of an organised manner; will start collating data and using the scripts etc. Is there a map in particular you would like me to play, to reference for your own comparisons to make things easier? As few variables as possible etc. Any particular commands or ways of speaking you would like me to try?
Some initial notes & basic observations so far, with some newly found quirks, yay! 😛

Lowering mic gain (I had my mic gain set to silly levels in-game for some reason; I don't remember doing it or why I would lol? It is now at 100%) seems to have helped a lot with accuracy of words & randomly detected noise attempting to be decoded. I no longer get the tapping of my keyboard, loud sighs, or mouth noises being picked up & Tacspeak trying to decode them. I am the only person in my house, with 2 cats. I was using listen key toggle -1 when it was their dinner time; they were meowing right underneath my chair & the NoiseSink feature didn't detect it at all, and there were no noises attempted to be decoded or false commands given.

Something of note: a weird quirk, I'm assuming with how Ready or Not is designed, with any Kaldi model I've used. Some of the speech commands were completely different from what I said. I should have realised this instantly as I've pointed this exact issue out before lol. Took me a bit to figure out what the f**k was happening. I was seriously, HUH!??

Is there a way to make a command so that when a command for wedge, mirror, or C2 is given on an ajar door, the operator closes the door &, if possible, continues the intended command(s) given? I have also noticed another door quirk, again I think due to Ready or Not's design. Edit: in hindsight I realise I should have tested this with other commands such as fall in, on me, cover me, commanding one team to do these commands whilst looking at the door the second team was stacked up on. Will update this comment with the result if you would like that.

Testing the listen key toggle 2 setting with the experimental Kaldi model, I've noticed if I stop briefly then continue, it will give an unintentional command mid-sentence (blue team "slight pause" breach and clear). I am attributing this to my own cautious bias & my own learnt speech habits interacting with Tacspeak.
When I speak in one fluid continuous sentence it does seem to work, although the listen key toggle 2 setting does sometimes pick up my random mouth noises when I sigh louder than normal, or if I make a "tutting noise" haha. This issue is very, very much reduced with mic gain now at 100%; basically almost a non-issue at this point. The listen key toggle -1 setting with the experimental Kaldi model also seems to have fewer unintentional noises detected with my mic gain now at 100%. If I mucked up a command (brain fart) I would let go of the mouse thumb button, which would lessen the impact of the error. I would then press the thumb button again & issue the fall in command to stop the command currently happening, which would come out correctly. I know there is a halt, cancel, stop command but my brain just thinks of fall in in the moment lol.
@jwebmeister NoiseSink seems to work as intended, with high accuracy in the few times it's activated since I've corrected my mic gain to 100%. If I do make a noise detected by NoiseSink, such as a burp, cough, or hitting the desk, it activates. I will test NoiseSink with words & phrases a person might say in surprise, fright, disappointment or anger.
@jwebmeister Newly found quirks aside, I am of the opinion the listen key toggle -1 setting is pretty good and pretty much working as intended... now that my mic gain is at appropriate levels. Again, I apologise. I will test this setting further whilst watching out for where I'm looking when giving said commands.
The "F word" & "F you" are often picked up as: on_recognition (INFO): KaldiRule(16, ReadyOrNot_priority::YellFreeze) | drop
I have some results of the testing here. There was only one mistake out of the short run of commands I did in-game as a little test run, to make sure things were running as they should. I wasn't sure how to change this in the text files to reflect the result, so I will explain: the command recorded "red team secure area" when I actually said "red team kick and clear". I said this a second time more clearly and it gave the correct command.
Thanks @madmaximus101
I don't think there's an easy fix, other than re-training the model, and my previous attempts to do just that didn't result in any improvements. However, current options or work-arounds are:
Not without first specifying via speech that the door is ajar, e.g. "wedge the ajar door" instead of "wedge the door". Similar to the multiple "door", "doorway", "hallway" issue, it's a problem more effectively solved from the game devs' (Void) side of things, as implementing a workaround from Tacspeak will reduce speech recognition accuracy. I'll consider revisiting this if there are no updates from Void that address some of the command menu quirks.
I thought I was going crazy, thank you, this explains quite a lot. I only ran into this issue when playing Ides of March (so far). If you've tested it, or are willing to test it, can you confirm what the extent is of it changing the team selection, and what commands it affects outside of just breach and clear?
The same thing happens with my speech. If you change
@madmaximus101 cheers for the video, it's extremely helpful.

Based on the video, Tacspeak and/or the experimental model isn't performing "good enough" imho (though I also need more test data). As you said, you're having to speak as a newscaster for it to be reliably accurate, and there were some commands spoken that were misrecognised for no good reason that I could determine, e.g. "on me" was recognised as "team remove wedge"!? It failing to recognise C4 as C2 (or written out as "c two") is reasonable to me as it's not a valid command; unless the grammar module has been explicitly changed to recognise "c four" as an option... then I definitely want to know about it.

In hindsight I realise there's probably too much manual effort required from testers to get good test data (as opposed to being an automatic process). For example, the test data will only show misrecognitions if the user manually cleans and updates the data, and the test data won't show failed recognitions (i.e. not recognised commands) unless the user mentally notes it or records the full play session. I haven't got any good ideas on how to fix this, however.

Have you tested the base model? Does it do better / worse than the experimental model?
Long weekend coming up. I have fixed up my mic gain issue & will do more testing of both models to have a proper comparison.

I might have given some unfair results, with my not knowing of silly mic gain levels & not really being aware of speech issues & quirks in my earlier posts/results. I will re-do my testing in a more thorough manner now that quirks & specific issues have been identified.

My current idea to be most helpful towards you atm, with the things needing further clarity or discovery, is recording video deliberately testing these issues/quirks to see what is possible/not possible/quirk/error etc, giving a link to the video along with a description of how things went, as well as the results from testing the retain.tsv. What sort of things are you looking for, or want cleaned, in regards to audio? I have a pretty quiet house as it's just me, so there are not often any random noises generated, apart from maybe my own speech quirks and mouth sounds. I am also thinking maybe I can put together an edited video of sorts comparing commands with different models: "same scenario, same commands, same doors - different model", switching between models as the video progresses.

In general I do have a sense that the medium model has fewer errors & I feel I am able to talk normally without feeling the need to be cautious with my speech. The large model is even more so like that. I haven't used the bare-bones base model suggested on the main Tacspeak page in a while.

When I breach & clear with the command for "c4" with the medium model or the large model, it works pretty reliably. Unsure if this is because of pure luck & it consistently recognising "c4" as "c2", or whether the language model has some sort of deliberate word detection for that specific thing. I can't remember it not working. Why I seem to have a habit of saying C4 instead of C2, lol.

I would be willing to learn things; I have always wanted to learn Python, never had a reason to, and this piques my interest very much. I would also be willing to do some speech training, is this something I can help with?

Thanks for the info, this will definitely help processing data on the next set of retain audio info I gather, thank you!

Is there a way for a spoken command to be deliberately denied or stopped if what was spoken is very different from the expected command, say when someone might be accidentally looking through multiple open doorways? Possibly a system where Tacspeak automatically implements a stop command in a situation where there is a massive difference between spoken & executed commands.
@madmaximus101 A direct comparison between the base model (I mean the medium lm model when I say base) and the experimental model. What works well in one but not the other is what I'm most concerned with.
I need to quantify it, and I need to test it using other people's speech other than my own. Please if you can, run the tests on the same retained data using:
The finetuning in the experimental model seems to have grossly skewed the word probabilities. This means that there's a larger difference between "c two" and other words, including "c four", in the experimental model than in the base model. This should make it both more accurate and more precise, but also less lenient.

Not yet, otherwise we'd both be wasting our time. At the end of this experiment, a very possible conclusion is that there's no practical benefit to finetuning the model (in fact I have SME advice saying exactly that), and that you'd need to train the model from scratch to see any real benefit. If this is the conclusion, hard test data would be of even greater benefit, as a model from scratch should be even more sensitive to the training process and data put into it.
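The effect of skewed word probabilities can be illustrated with a toy decoder. This is only a sketch of the general acoustic-score-plus-language-model-score trade-off; the scores and probabilities below are made up, and this is not how Kaldi's decoder is actually implemented:

```python
import math

# Toy decoder: pick the hypothesis maximising
# acoustic log-likelihood + language-model log-probability.
def pick(hypotheses):
    return max(hypotheses, key=lambda h: h["acoustic"] + math.log(h["lm_prob"]))

# Suppose the audio acoustically fits "c four" slightly better than "c two".
acoustic = {"c two": -10.5, "c four": -10.0}

# Base LM: word probabilities are close, so the acoustic edge decides.
base = [{"text": t, "acoustic": acoustic[t], "lm_prob": p}
        for t, p in {"c two": 0.4, "c four": 0.3}.items()]

# Finetuned LM: probability mass heavily skewed toward in-grammar phrases,
# so the LM margin overrides the acoustic edge (more precise, less lenient).
finetuned = [{"text": t, "acoustic": acoustic[t], "lm_prob": p}
             for t, p in {"c two": 0.7, "c four": 0.01}.items()]

print(pick(base)["text"])       # acoustics win with a flat LM
print(pick(finetuned)["text"])  # the skewed LM wins instead
```

With the flat LM the acoustically-better "c four" is chosen; with the skewed LM the in-grammar "c two" wins even against worse acoustics, which matches the "more accurate but less lenient" description above.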
Ok, got it. It just clicked (lightbulb moment): the retained audio files don't change, the AI does. Makes sense.
Issue #14 requires support / integration from Void. Alternatively, for a flub while speaking, there could be a key phrase to just change the command action to noop (do nothing), e.g. "<dictation> (s- | f-) I messed up". I deliberately haven't tried it or put it in because I believe it's very likely to negatively affect speech recognition accuracy, e.g. a valid command + some noise at the end = noop instead of a valid command. Having said that, it might be worth experimenting; I've just had other priorities.
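The noop key-phrase idea could be prototyped outside the grammar module. A minimal sketch in plain Python, where the cancel phrases and the `resolve_action` helper are hypothetical, not Tacspeak's actual API:

```python
# Hypothetical cancel phrases; in Tacspeak these would live in the
# grammar module spec, not a hard-coded tuple.
CANCEL_SUFFIXES = ("i messed up", "scratch that")

def resolve_action(recognized_text, action):
    """Return the recognised command's action, unless the speaker appended
    a cancel phrase after flubbing the command, in which case return a noop."""
    text = recognized_text.lower().strip()
    if text.endswith(CANCEL_SUFFIXES):
        return lambda: None  # noop: deliberately do nothing
    return action
```

The risk noted above applies directly: if trailing noise ever gets recognised as a cancel suffix, a valid command silently becomes a noop.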
That's more effort than I put in! I've just been renaming the model folders, for no good reason, but the user_settings should work if you're running tacspeak.exe without additional arguments.
Yep. Ideally playtest with each model for at least a few missions. Then run --test_model using each model on all of the data retained from the playtests (including the playtests where the same model wasn't used). Hopefully that makes sense.
It's more that I got annoyed with having to copy/paste/delete/change folder names to use Tacspeak with the model I was wanting to use. This way I don't have to chop & change folder names or move folders around to use Tacspeak in-game with a different model lol.
All voice data collected during gameplay. Delete audio files containing mistakes or misspoken words/obvious errors, as well as deleting the corresponding entry in the associated files along with it. The earlier post where you instructed further on the retain.tsv thing will help with this. Will comment for further assistance if I get stuck on this again.
Am currently playing Ides of March. Each video I record for basic at-face-value assessment will be using a different model.

Something I have potentially picked up: the restrain command sometimes does not work correctly; I've had this issue in various levels of error regardless of model. When it doesn't work correctly, it's often followed by a move here command, or fall in.

Another thing I have picked up: if you tell a team member to mirror the door, wedge the door, C2, gas etc., sometimes this command will designate red or blue to initiate the command instead of gold. At first I was like huh... this command would often be repeatable with the same result... then it hit me... the team that gets designated to fulfil the command are the only ones with said device... so of course it will either default the command to the team with the device or it will just be designated as gold team. Ie: not a problem, will need to investigate this further to confirm.

Guess what, another thing :D If looking at a door & issuing on me or fall in, the command breach & clear will be executed.
Have made 3 videos depicting E-LM, M-LM & B-LM. I almost went for editing the vids into one homogeneous vid, but my brain didn't like the idea after all lol. Will be uploading shortly, with descriptions & general info of each vid's happenings & quirks. All on the Ides of March map.

The erroneous red/blue designation of tasks seems to be limited to the E-LM model. Overall, my findings are that the M-LM & B-LM are much more stable speech-recognition wise. Across the board there are missteps & wrong commands given; even with the M-LM & B-LM this can imo be attributed to not looking at the exact spot intended in the exact moment the command was given, i.e. not looking exactly at the spot to arrest a suspect, not looking exactly at a door, accidentally looking through multiple doorways. The E-LM model does seem to have a few errors; there is no denying that.

What was/is the goal for the E-LM model? To have a custom, bespoke speech recognition exactly/specifically designed for Ready or Not & Tacspeak? Smaller file size overall? If there is some sort of specific design choice/pathway/idea for the E-LM I would be willing to brainstorm or help with further refining the idea. I'm definitely nowhere near your level of knowledge with coding though, so I wouldn't be able to help with that aspect. I do have a good problem-solving brain lol: fixing up cars, electrics, IT, networking (Unraid mostly), all self-taught etc. Giving you context is all 😀

From what I understand of your previous comment, there's no point bothering with further speech training on the E-LM if it turns out it's a bust. There's no point speech training the E-LM if the backbone of the AI speech recognition of the E-LM is too strict or not as... "flexible?" in the first place?
Potential solution to users not being proficient in correctly sorting/refining/cleaning & getting good data: upload the entire Tacspeak folder to Google Drive with all data intact?
No. I don’t need or want anyone to upload their speech data anywhere. I only need the overall test results and any specific findings on what words the experimental model gets wrong that the base model gets right.
It shouldn’t have an effect, or at least not a positive one, that setting is related to the voice activity detector. There isn’t really a direct setting to intentionally capture audio before you press the listen_key.
It’s a test of the model finetuning / training process, to figure out what part of the process needs to be adjusted and/or if it’s (or which areas are) worth further investment of time and effort.

There are a number of things I can try to address some of the issues already identified, but I need hard data to narrow it down to specifics, so that I’m not wasting my time. All of the potential fixes will take a great deal of time and effort, beyond what I’ve already put in. There aren’t any design decisions to be made until the finetuning and training process + code is 100% “working”. The most helpful thing that can be provided right now is test data. After that I can prioritise tasks and put together a plan of attack; doing so before gathering and reviewing test data is a waste of time.
The easiest workaround is to run PowerShell as administrator; otherwise check out this article. Make sure to run the relevant “list_” script first before running any “delete_” scripts, to make sure only the correct items will be deleted. There’s no undo with PowerShell. Edit: also run the scripts from the same directory as tacspeak.exe and where the “retain” folder lives, e.g. ./scripts/some_script.ps1.
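The list-before-delete pattern those scripts follow can be sketched in Python. This is a hypothetical re-implementation for illustration, assuming the .wav path is the first tab-separated column of retain.tsv; the real .ps1 scripts may differ:

```python
from pathlib import Path

def wavs_missing_from_retain_tsv(retain_dir):
    """List .wav files in retain_dir that are no longer referenced by
    retain.tsv, i.e. leftovers after lines were deleted from the .tsv."""
    retain_dir = Path(retain_dir)
    tsv = retain_dir / "retain.tsv"
    referenced = set()
    for line in tsv.read_text(encoding="utf-8").splitlines():
        if line.strip():
            # assumption: column 0 holds the .wav path
            referenced.add(Path(line.split("\t")[0]).name)
    return sorted(w for w in retain_dir.glob("*.wav") if w.name not in referenced)

def delete_wavs(paths, dry_run=True):
    """Always run with dry_run=True first (the 'list_' step), review the
    output, then run with dry_run=False (the 'delete_' step): no undo."""
    for p in paths:
        if dry_run:
            print(f"would delete {p}")
        else:
            p.unlink()
```

The two-step split mirrors the list_/delete_ script pairing: the dry run is cheap insurance against deleting audio you meant to keep.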
These are my first proper test videos with E-LM, M-LM & B-LM on the Ides of March map.

This is the E (Experimental)-LM model being used in this video: https://www.youtube.com/watch?v=3qDAMdt_v_k
This is the M-LM model (kaldi_base_model) being used in this video: https://www.youtube.com/watch?v=1fxtZCWRs3w&t=635s
This is the B-LM model being used in this video: https://www.youtube.com/watch?v=o4niN0lOiVg&t=211s

I had a suspicion the arrest/restrain command wasn't being issued because I wasn't moused over the exact point where the restrain command can be given, & indeed it is a "mousing over the exact point required for the restrain command" issue. When moused over the correct point for restrain to be activated, it becomes the top of the menu instead of the door commands being at the top. There is currently no sub-menu to navigate to the restrain command if the door command menu is at the top. I feel this is in general a Ready or Not issue overall. I imagine people who play Ready or Not without a speech mod experience the same frustrations.
I will get onto this, provide video & screenshots if possible. I will find a spot on Ides of March that can give the error & then attempt it on other maps. I will test this with the other models also.
Test results for E-LM, M-LM & B-LM, .txt files provided:
- B-LM results: test_model_output_overall.txt
- M-LM (base model) results: test_model_output_overall.txt
- E-LM results: test_model_output_overall.txt

If you would like, for further context I can edit the names of each audio file & take a screenshot so you have context for what the commands were/are in order. This way I can communicate what the audio files said vs what the test spits out.

This took me a while to get to this point. I decided it was easier for myself to make clean audio from the start: no muck-ups, no mistakes, no verbal garbage, attempting to have no noise be picked up & no accidental freeze or yell. This is harder than I thought haha, but I got there. Speaking "gold" sometimes will result in the command halt being given, even when using the B-LM.
I have discovered the crux of the issue with the quirk relating to commanding one team to do something but the other team doing it instead.

Red team stacked up: https://www.youtube.com/watch?v=Yxb3NznJFi4
Blue team stacked up: https://www.youtube.com/watch?v=WNpaVtaM72M

It seems that when red or blue is stacked up, that team "takes ownership" of the door, if that makes sense? So when looking at said door "claimed" by red or blue, the team currently stacked will be the team that follows the command, even if you specified the other team to do the command. If gold team is stacked up, red or blue can be told to breach & the other team will back off. I believe this quirk to be limited to doors/doorways & hallways where a "hidden door" in the middle of a hallway exists, like on Ides of March.

Edit: investigated this issue further: https://www.youtube.com/watch?v=yvEQ_PVDoP0

Maybe a mod that makes all commands available regardless of where you're looking and what team is doing what, in one big "command tree" that always stays the same? Hypothetically I could see this maybe making Tacspeak usage & commands potentially quirk-free?
I've looked into how to go about setting up the speech training stuff to add to my own Tacspeak. Wow... it's a lot.
My posts & findings/results have been rather sporadic & all over the place. Apologies for that, I know it probably wasn't too helpful for proper data. You could consider my posts here a gradual journey of myself discovering & learning as I go.
@madmaximus101 thanks for testing bud. You’ve done infinitely more than anyone else! I wrote a longer comment but lost it due to router / ISP shenanigans so I’m just going to dot point it below.
Were the .txt files I provided of the 3 models helpful in some way? Or did I provide the data in the wrong manner?
@madmaximus101 They absolutely were, thank you. The thing I noted was that there were only 33 commands, I average ~30-40 per mission, and that the medium and large lm models had 0% command errors, likely indicating only a single mission was run using just one of the models. Other than that, there’s also “things of note” that aren’t covered by the automated tests that you only get from play-testing. |
You're correct, I did run one mission with one model and ran the data through the test thing. 1 mission run with each model. Got it, from what I remember of earlier posts.
I have upped my mic gain in-game from 100% to 110% to test if my misrecognised commands are volume related if at all. |
You shouldn't need to do this manually. "YellFreeze" and "NoiseSink" should already be excluded if you included in
Yes please. Both the text and the rule in
In-game mic settings should have zero effect on Tacspeak. Windows sound settings and your physical mic gain (or interface, if you use one) might affect things if it's near-inaudible (or way too loud), but shouldn't if it's within normal range.
@madmaximus101 if you’re willing / able to test, can you try playtesting a mission, start every spoken command with the correct team colour, listen back to the retained audio, see if the colour gets cut from audio? I’m not sure if this cut audio issue is a user issue (I pressed the listen_key too late) or a code issue, but I need further testing done and my machine is locked down at the moment. Random side note: in testing I thought the model picked up silence as “blue” but listening back I could clearly hear “blue” spoken faintly, even though I was 99% confident I said nothing. I think I’m going crazy. |
Actually... I do have one: the Epos gaming app, for my Sennheiser GSP 670s. I do have a lot of minor static and low-level background noise. Will test and get back to you.
@jwebmeister The inconsistencies with designating blue or red with mirror or wedge are still present, but very much reduced with my refined mic settings. In this pic I've highlighted the audio file & the retain file reference; in this highlighted example there was no static or distortion/white noise. But hey, overall it was a much more improved experience!

Vid showing E-LM with adjusted mic settings with the Epos app. Very much improved experience. Pending results from 100% noise cancelling, I may upload a second video and edit this comment showing its results as well.
Suggestion: is there mirror command/wedge command weirdness due to the... type? of door? Does the wedge/mirror spoken verbally have anything to do with not specifying the type of door in the command? Or does the command auto-assume a type of door, hence the weirdness? This red/blue weirdness is less common with the trap command. Just spitballing here; what I'm thinking probably isn't a thing if you're not having those issues.
@madmaximus101 yep, please let me know how it goes. If it’s a significant improvement I’ll be surprised, but if so, it narrows down what I need to refine in the training data.
I don’t know what you specifically mean. For it to select blue vs red? If the Tacspeak console says the current team, or the correct spoken team, then it’s not an issue with the model. In general, if the Tacspeak console prints the right command, it’s not the model's fault.
Nvm, was thinking maybe different types of doors were named/coded as a particular door type. Don't think that's the case, my bad.
You can specify “wedge the door” or “wedge the trapped door”, just as one example. It’s all in the grammar module. I haven’t noticed it causing any issues in my testing though. |
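Grammar specs like these typically use square brackets for optional words, so one spec covers several spoken variants. A toy expander for illustration only (this is not Tacspeak's or dragonfly's actual parser; it handles only flat [optional] groups, not nesting or (a|b) alternatives):

```python
import itertools
import re

def expand_spec(spec):
    """Expand a simple spec with [optional] words into every concrete
    phrase it matches."""
    parts = re.split(r"(\[[^\]]*\])", spec)
    choices = []
    for part in parts:
        part = part.strip()
        if not part:
            continue
        if part.startswith("["):
            # optional word: either absent ("") or present
            choices.append(("", part[1:-1].strip()))
        else:
            choices.append((part,))
    phrases = set()
    for combo in itertools.product(*choices):
        phrases.add(" ".join(w for w in combo if w))
    return sorted(phrases)

print(expand_spec("wedge [the] [trapped] door"))
```

Here a single spec matches "wedge door", "wedge the door", "wedge trapped door", and "wedge the trapped door", which is why checking the grammar module directly is the quickest way to see which phrasings are valid.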
I will look at the grammar module more deeply for the proper words/phrases. |
@madmaximus101 don’t worry I figured it out. It was my audio settings. I had a gate setup that was just slightly too slow and/or too high. |
The things I've gathered so far from reviewing your test data + videos @madmaximus101 :
@madmaximus101 can you please review and let me know what's missing?
I think if you're speaking a command of any kind, but looking at a door/entryway, or suspect/teammate, regardless of what you say, it will execute whatever it thinks you said that is available in that command menu at the time. "On me" being recognised as pie room might be one of those. I've had consistent misrecognitions with "on me", not as much with my refined mic settings though. "Fall in" pretty much works all the time; I can't remember it failing, apart from random red/blue designation. Again, it doesn't happen as often now that I've refined my mic settings.

Testing E-LM on the Postal map, I had quite a few misrecognitions on one door at the office where you often come across the corrupt "FBI officer". Have another video showing the same settings, same mic settings, with more failures with recognition, because I was speaking/testing so much I couldn't speak properly by the point I recorded the video lol. I have quite a few vids now showing a few quirks.

Idea: for further context and understanding, it might be good to link me a shared link with a timestamp on a video you've watched, for exact context if you see an issue. There might be some context I didn't explain properly.

Thought I'd point out something: the word "mirror", how does the model expect to hear it? Does the model expect to hear a more American-sounding "Mirreerrr" or an Aussie "Mirraa"? The American-worded mirror, if spoken quickly, literally just sounds like "Mirrrrrrrer" with a buttload of R's lol.
Post test results + useful remarks here, ideally of both the experimental model and the base model, using the same test data, and using the default Ready or Not grammar module.

Useful remarks include:

Important instructions:
- Make sure `retain.tsv` has the correct rules + text; see the example workflow near the end of these instructions.
- There's a script, `/scripts/copy_retain_item_cmds_only.ps1`, that can be used in PowerShell to copy only "normal commands" out of `./retain/` and into `./cleanaudio_cmds/`.
- Use the default `_readyornot.py` grammar module, or very minor modifications, i.e. no new words.
- Run `./tacspeak.exe --test_model './cleanaudio_cmds/retain.tsv' './kaldi_model/' './kaldi_model/lexicon.txt' 4`
- There are scripts in the `./scripts/` folder related to cleaning up the `retain.tsv` and related .wav files.

Example workflow:
- Open `retain.tsv` and go through each line, reviewing the rule and text.
- Play the .wav files from the `./retain/` folder in VLC media player on single file loop, pressing 'N' to move to the next .wav as I read through each line of `retain.tsv`.
- Option A: edit `retain.tsv` to align with the audio, remove unusable lines from `retain.tsv`, then when I'm done reviewing I run the `list_wav_missing_from_retain_tsv.ps1` script first to make sure I'm deleting the right files, then run the `delete_wav_missing_from_retain_tsv.ps1` script (option A is preferred, but hey we're all busy and life is too short to spend cleaning all the data).
- Option B: remove incorrect lines from `retain.tsv`, then when I'm done reviewing I run the `list_wav_missing_from_retain_tsv.ps1` script first to make sure I'm deleting the right files, then run the `delete_wav_missing_from_retain_tsv.ps1` script.

Example report:
- `"listen_key_toggle": -1`, using `USE_NOISE_SINK = True`
- …; also picked up in base model but not as often.
- `_readyornot.py` without any modifications

('./kaldi_model/', './retain/retain.tsv', 'Command', 'WER', 'Overall -> 5.00 % +/- 9.55 % N=20 C=19 S=1 D=0 I=0')
('./kaldi_model/', './retain/retain.tsv', 'Command', 'CMDERR', {'cmd_not_correct_output': 0, 'cmd_not_correct_rule': 0, 'cmd_not_correct_options': 0, 'cmd_not_recog_output': 0, 'cmd_not_recog_input': 0, 'cmds': 4})
('./kaldi_model_base/', './retain/retain.tsv', 'Command', 'WER', 'Overall -> 5.00 % +/- 9.55 % N=20 C=19 S=0 D=1 I=0')
('./kaldi_model_base/', './retain/retain.tsv', 'Command', 'CMDERR', {'cmd_not_correct_output': 0, 'cmd_not_correct_rule': 0, 'cmd_not_correct_options': 0, 'cmd_not_recog_output': 0, 'cmd_not_recog_input': 0, 'cmds': 4})
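The WER lines in those result tuples can be sanity-checked by hand: word error rate is the standard (S + D + I) / N, i.e. substitutions plus deletions plus insertions over the number of reference words. A small sketch, mimicking the report's counts:

```python
def wer(n, c, s, d, i):
    """Word error rate (%) from alignment counts: N reference words,
    C correct, S substitutions, D deletions, I insertions."""
    assert c + s + d == n, "correct + substituted + deleted must cover all reference words"
    return 100.0 * (s + d + i) / n

# Experimental model line: N=20 C=19 S=1 D=0 I=0 -> 5.00 %
print(f"{wer(20, 19, 1, 0, 0):.2f} %")
# Base model line: N=20 C=19 S=0 D=1 I=0 -> 5.00 %
print(f"{wer(20, 19, 0, 1, 0):.2f} %")
```

Both models score 5.00 % here, but the error type differs (one substitution vs one deletion), which is why the separate CMDERR counts and play-testing remarks matter for comparing them.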