Releases: common-voice/common-voice

Sprint 33: July 29 - Aug 14

14 Aug 23:21
2b3f7fc

Please note, Common Voice has been significantly impacted by Mozilla's layoffs. The team is hard at work evaluating a transition plan for Common Voice that will allow it to continue to grow and thrive, and until we have identified next steps for the project, Common Voice will be in maintenance mode. That means that we will fix security vulnerabilities, add localization updates, and import new sentences, but there will be no significant bugfixes or any feature development for the foreseeable future.

This release:

  • Removed dataset survey (#2860)
  • Improved mobile styling for error pages (#2867)
  • Optimized page layout (#2863, #2864)
  • Security update for serialize-javascript (#2876)
  • Fixed bug where selecting a language other than the current locale from the /languages page would redirect to 404 (#2872)
  • Sentence batch #8 from RW partnership (#2873)
  • Removed ~10k difficult-to-pronounce English sentences (#2874)

Sprint 32: July 15 - July 28

28 Jul 23:18

Relatively few feature updates this sprint since the team is in planning mode, trying to figure out our priorities for the second half of 2020.

The biggest change is that we migrated from voice.mozilla.org to commonvoice.mozilla.org. Until the content for the new site is ready, we will be redirecting all traffic to the new domain. Once the new site is launched, there will be a banner on voice.mozilla.org pointing people to the new domain name, and we will be automating redirects for known routes as much as possible. See the Q2 community update on Discourse for more info.

  • Brand new 404/503 page! We've also changed the routing slightly on 404 pages so that now they should be catching localized 404s as well. (#2840)
  • Fixed blurry screenshots on profile settings screen (#2719)
  • Fixed percentage displays on stats dashboard (#2790)
  • Changed sorting behaviour on /languages page so that English is not always last (#2844)
  • Upgraded lodash version to address a security vulnerability
  • A new corpus of rw sentences from our partner org (#2841)
  • Sentence collector updates
  • Localization updates

Sprint 31: July 1 - July 14

14 Jul 22:22
461d087
  • Released version 5.1 of the dataset, as it came to our attention that 5.0 unintentionally altered the column order of the test/train/dev TSV files and included some redundant metadata entries for clips that didn’t actually have valid audio. Get the latest dataset here: https://voice.mozilla.org/datasets
  • Fixed a bug on the download page where switching languages after confirming email and terms wouldn't update the link
  • Updated average duration stats based on the latest dataset, so that the /languages page reflects estimates closer to the truth (#2814)
  • Enhancement to switch the leaderboard language automatically when toggling between your contribution languages on the dashboard (#2786)
  • Fixed UI bug on the dashboard that made the languages dropdown inaccessible on mobile for certain languages (#2791)
  • Hid leaderboard if there are fewer than 5 contributors for that language
  • Added exemption for Single Sentence Record Limit for smaller languages with fewer than 500k speakers globally
  • Identified logged-in users for FullStory
  • Additional minor repo fixes, including icon hover states, edits to README, etc.
  • Localization and sentence updates, including a large corpus for rw from our partner org

Sprint 30: June 10 - June 30

03 Jul 22:09

This was a longer sprint than usual to accommodate Mozilla's virtual all hands, which took place June 15-19. The biggest thing in this release:

  • A new dataset!! Common Voice Corpus 5 is now available to download, as well as the single-word benchmark target segment. See the Discourse post for more information.
  • This includes some back-end changes that accommodate displaying and saving multiple dataset versions, in preparation for allowing people to access older versions of the dataset more easily

As you might imagine, that took up most of the team's time. Here are the additional features and bugfixes for this release:

  • Migration that backfilled some data for single-sentence record limit for languages that are close to depleting their stock of available sentences to record, to improve the user experience. We are also actively investigating exemptions for smaller languages for this feature.
  • Added safety check to ensure all client_ids are RFC-4122 compliant GUIDs
  • We fixed a long-standing issue where clips were occasionally being saved as the wrong locale
  • Regular localization and sentence import updates
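
For readers curious what the client_id safety check above might look like, here is a minimal sketch. The name `isRfc4122Guid` and the pattern are illustrative assumptions, not the project's actual code. It accepts only the canonical 8-4-4-4-12 hex form with an RFC 4122 version digit (1-5) and variant digit (8, 9, a, or b).

```typescript
// Hypothetical helper, not taken from the Common Voice codebase.
// Canonical textual form: 8-4-4-4-12 hex digits, where the first digit of
// the third group is the version (1-5) and the first digit of the fourth
// group encodes the RFC 4122 variant (8, 9, a, or b).
const RFC4122_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Type guard: true only for strings in the RFC 4122 form described above.
function isRfc4122Guid(clientId: unknown): clientId is string {
  return typeof clientId === 'string' && RFC4122_PATTERN.test(clientId);
}
```

Under this pattern the all-zero "nil" UUID is rejected, which may or may not match what the real check does.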

Sprint 29: May 27 – June 9

10 Jun 21:46

Changes included in this release:

  • Removed an unnecessary Sentry capture
  • Upgraded Fluent and our localization tools
  • Added Catalan discourse
  • Decreased lock time to improve sentence importing
  • Fixed bugs / handled edge-cases around missing accounts
  • Further worked on the taxonomy import process
  • Turned on logging for AWS
  • Added Czech Wiki Extraction
  • Added issue templates for bug reports and feature requests
  • Added June 2020 Rwandan Partner sentences
  • Created architecture for single sentence record limit, which will improve our dataset / collection process by ensuring valid clips aren't unnecessarily re-served to clients
  • (Ongoing) updated localizations

Partial Sprint 28: May 23 - May 26

26 May 20:00

Our last release was only four days ago, but we’re continuing with today’s regularly scheduled release. This release is mainly focused on bugfixes and stability. Some changes include:

UI and experience:

  • Made some final UI tweaks for taxonomy messages
  • Tweaked tablet UI
  • Removed demographic data from frontend for languages with a small contributor base

Other features:

  • Added a second round of single word benchmark phrases
  • Standardized our server’s logging pipeline by removing JSON log outputs
  • Improved our error monitoring capabilities with Sentry
  • Re-enabled https://voice.mozilla.org/contribute.json

Bugfixes:

  • Fixed contribution activity locale select
  • Prevented open redirects
  • Added checks to gracefully handle queries with missing data, and added tighter error checking to our endpoints
  • Fixed a bad DB query
  • …and a few others
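
As an illustration of the open-redirect fix listed above, one common approach is to accept only same-site relative paths as redirect targets. The sketch below assumes that approach; `safeRedirectPath` and its fallback parameter are hypothetical names, and the actual fix may work differently.

```typescript
// Hypothetical helper, not taken from the Common Voice codebase.
// Returns the target only if it is a plain same-site path; otherwise
// returns the fallback.
function safeRedirectPath(target: string | undefined, fallback = '/'): string {
  if (!target) return fallback;
  // Must be a relative path starting with a single "/".
  // "//host" is a protocol-relative URL and would leave the site.
  if (!target.startsWith('/') || target.startsWith('//')) return fallback;
  // Reject backslashes (some browsers treat "/\host" like "//host")
  // and ASCII control characters anywhere in the target.
  if (/[\\\u0000-\u001f]/.test(target)) return fallback;
  return target;
}
```

For example, `safeRedirectPath('/en/speak')` returns the path unchanged, while absolute or protocol-relative URLs fall back to `/`.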

Mid-sprint release: Target Segments

23 May 00:04

As mentioned last week, this was a mid-sprint release to enable collection of target segments of clips. The Discourse post has more information about what this means for contributors and how to help out. This release went out on Wednesday, May 20th.

Technical changes specific to target segments:

  • Refactored Clip/Sentence types to account for taxonomy data and to ensure greater consistency between frontend and backend object models
  • Added sentence and clip serving logic to prioritize the target segment first and to limit validations to 2x per sentence
  • Modified clip saving logic to account for taxonomy data
  • Modified clip saving mechanism to pass sentenceId through headers instead of re-hashing the same sentence server-side and introducing inconsistencies
  • Added sentences for the target segment and modified sentence import logic to permit repeated phrases/words across languages
  • Added call-to-action banner on the frontend, which included refactoring existing banner functions to allow for persistent banners with multiple links
  • Added explanatory info on contribution cards for target segment phrases that link out to Discourse
  • Added segment-specific recording minimum length and increased minimum length for regular sentences
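
To make the serving rule above concrete, here is a rough sketch of "target segment first, at most two validations per sentence". The `CandidateClip` shape, its field names, and `pickClipsToServe` are all assumptions for illustration; the real logic lives in database queries and is not shown in these notes.

```typescript
// Hypothetical data shape, not taken from the Common Voice codebase.
interface CandidateClip {
  id: number;
  inTargetSegment: boolean;    // assumed flag: clip belongs to the target segment
  sentenceValidations: number; // assumed counter: validations already recorded for its sentence
}

const MAX_VALIDATIONS_PER_SENTENCE = 2;

// Drops clips whose sentence has reached the validation limit, then orders
// target-segment clips ahead of regular ones and returns the first `count`.
function pickClipsToServe(candidates: CandidateClip[], count: number): CandidateClip[] {
  return candidates
    .filter(clip => clip.sentenceValidations < MAX_VALIDATIONS_PER_SENTENCE)
    .sort((a, b) => Number(b.inTargetSegment) - Number(a.inTargetSegment))
    .slice(0, count);
}
```

Because `filter` returns a new array, the caller's list is left untouched, and the stable sort keeps the original order within each group.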

Additional features and bugfixes bundled with this release to ensure smoother experience for target segments:

  • Fixed several related bugs for client-side audio processing, including a bug that repeatedly threw the "too quiet" error even if contributors were shouting
  • Sentences that contributors have previously skipped will no longer appear in the sentence queue
  • Added Feature type to allow for more robust feature flag use
  • Modified sentence and clip serving logic to improve query performance
  • Modified clip fetching to minimize unnecessary downloads
  • Added additional indexing to database tables to improve performance
  • Consolidated server-side hashing functions
  • Enabled CSP for all environments for consistency of testing
  • Added useStickyState custom hook to reliably access localStorage values
  • Fixed bug in the "report sentence/clip" function
  • Enabled server-side Sentry bug tracking

Sprint 27: April 29 - May 12

12 May 17:10

A smaller release this time because we're also gearing up for a mid-sprint feature release. Look out for that in the next few days!

  • Upgraded TypeScript from 3.7 to 3.8, which required updating Prettier from 1.x to 2.0.5. This impacted a large number of files at a cosmetic level but did not result in any functional changes
  • Improved user agent detection for recording on Safari
  • Removed the legacy Nubis configuration files
  • Sentences and localization updates
    • Renamed hi-IN to hi after discussion with the community
    • Incorporated corpus extracted from Dutch Europarl transcriptions
    • Added Odia as a contributable language
  • Minor UI bugfixes

Release for Sprint 26: April 15 - April 28

28 Apr 17:14

Two big changes this release:

  1. We now support recording on iOS! 🎉🎉🎉 This update makes use of a MediaRecorder polyfill, which currently only works in the native Mobile Safari. Please navigate to https://voice.mozilla.org/speak using Safari on your iOS device and report any bugs you experience using GitHub issues.

    This also means we've decommissioned the iOS app (which hasn't been updated since 2017) and removed that code from the codebase. We will also be removing the app from the iTunes app store shortly. More detail available in this Discourse post.

  2. We have moved our infrastructure to Kubernetes, to fully align with best practices within Mozilla. The bulk of this work happened last Wednesday, during which there was a brief outage. There will be some follow-up changes to config files and documentation as we clean up.

Other updates:

  • Removed outdated UI specs from codebase
  • Initial work in place for benchmarking experiment, including database schema changes and first round of sentences
  • Terms and conditions now available for more locales (details in this issue)
  • Minor type refactoring
  • Minor UI and UX fixes
  • Localization and sentence updates

Sprint 25: April 1 - April 14

14 Apr 17:22
  • Localization and sentence collector updates
  • Added maintenance mode view and related routing settings
  • Added better db indexing that should speed up sentence serving for all languages
  • Set up new table schema in preparation for upcoming segmented collection efforts
  • Fixed minor localization bugs
  • Fixed minor UI bugs