Commit

Merge branch 'stage' into production
moz-rotimib committed Jan 19, 2024
2 parents 9419c11 + d30c272 commit 908ebc2
Showing 49 changed files with 1,099 additions and 168 deletions.
24 changes: 12 additions & 12 deletions bundler/package-lock.json

Some generated files are not rendered by default.

17 changes: 10 additions & 7 deletions docs/Sample Bulk Submission - Sheet1.tsv
@@ -1,7 +1,10 @@
Sentence Source Additional rationale for open license
Six years have passed since I resolved on my present undertaking. Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm My own submission, copyright waived
During her illness, many arguments had been urged to persuade my mother to refrain from attending upon her. Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm My own submission, copyright waived
She died calmly; and her countenance expressed affection even in death. Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm MCV CC0 waiver process - see legal form
My cat is a strange little dude. Jessica Rose (self) MCV CC0 waiver process - see legal form
I should have brought sunscreen. Jessica Rose (self) More than 100 years since publication
Have you read the Doraemon comics yet? Jessica Rose (self) More than 100 years since publication
Sentence (mandatory) Source (mandatory) Additional rationale for open license (mandatory) Sentence Quality Assurance Feedback (optional) O = satisfactory sentence, X = unsatisfactory sentence
Six years have passed since I resolved on my present undertaking. Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm My own submission, copyright waived O
During her illness, many arguments had been urged to persuade my mother to refrain from attending upon her. Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm My own submission, copyright waived O
She died calmly; and her countenance expressed affection even in death. Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm MCV CC0 waiver process - see legal form O
My cat is a strange little dude. Jessica Rose (self) MCV CC0 waiver process - see legal form O
I should have brought sunscreen. Jessica Rose (self) More than 100 years since publication O
Have you read the Doraemon comics yet? Jessica Rose (self) More than 100 years since publication O
Her don't like pizza. Jane Doe (self) My own submission, copyright waived X
The cat was sitin on the windowsill. Jane Doe (self) My own submission, copyright waived X
The 3 elephants were playing in the mud John Doe (self) My own submission, copyright waived X
19 changes: 11 additions & 8 deletions docs/submitting-bulk-sentences.md
@@ -11,14 +11,17 @@ Please format your bulk sentences into a TSV file with two columns, one containi

The more information you are able to provide in the Source column, the easier it will be to get your bulk sentence submission validated.

| Sentence | Source | Additional rationale for open license
|---|---|---|
| Six years have passed since I resolved on my present undertaking. | Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm | My own submission, copyright waived |
| During her illness, many arguments had been urged to persuade my mother to refrain from attending upon her. | Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm | My own submission, copyright waived |
| She died calmly; and her countenance expressed affection even in death. | Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm | MCV CC0 waiver process - see legal form |
| My cat is a strange little dude. | Jessica Rose (self) | MCV CC0 waiver process - see legal form |
| I should have brought sunscreen. | Jessica Rose (self) | More than 100 years since publication |
| Have you read the Doraemon comics yet? | Jessica Rose (self) | More than 100 years since publication |
| Sentence | Source | Additional rationale for open license | Sentence Quality Assurance Feedback (optional) O = satisfactory sentence, X = unsatisfactory sentence
|---|---|---|---|
| Six years have passed since I resolved on my present undertaking. | Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm | My own submission, copyright waived | O
| During her illness, many arguments had been urged to persuade my mother to refrain from attending upon her. | Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm | My own submission, copyright waived | O
| She died calmly; and her countenance expressed affection even in death. | Frankenstien, Mary Shelly, 1818, https://www.gutenberg.org/files/42324/42324-h/42324-h.htm | MCV CC0 waiver process - see legal form | O
| My cat is a strange little dude. | Jessica Rose (self) | MCV CC0 waiver process - see legal form | O
| I should have brought sunscreen. | Jessica Rose (self) | More than 100 years since publication | O
| Have you read the Doraemon comics yet? | Jessica Rose (self) | More than 100 years since publication | O
| Her don't like pizza. | Jane Doe (self) | My own submission, copyright waived | X
| The cat was sitin on the windowsill. | Jane Doe (self) | My own submission, copyright waived | X
| The 3 elephants were playing in the mud | Jane Doe (self) | My own submission, copyright waived | X

You will need a Github account to submit bulk sentences to Common Voice. If you don’t currently have an account, you can [sign up for one here](https://github.com/signup).
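
As a rough illustration of the four-column layout shown in the table above, here is a minimal sketch of reading such a TSV and checking the optional quality-assurance column. The column order and the `O`/`X` values are taken from the sample; the `Row` type, the `parseBulkTsv` helper, and the validation rules themselves are assumptions made for this example and are not Common Voice's actual validator.

```typescript
// Hypothetical row shape; the field names are chosen for this sketch only.
type Row = {
  sentence: string
  source: string
  rationale: string
  qa?: 'O' | 'X'
}

// Type guard for the optional QA column (O = satisfactory, X = unsatisfactory).
const isQa = (value: string): value is 'O' | 'X' =>
  value === 'O' || value === 'X'

// Parse tab-separated text whose first non-empty line is the header row.
function parseBulkTsv(tsv: string): Row[] {
  const [, ...lines] = tsv.split(/\r?\n/).filter(line => line.trim() !== '')

  return lines.map((line, i) => {
    const [sentence, source, rationale, qa] = line
      .split('\t')
      .map(cell => cell.trim())

    // The first three columns are marked mandatory in the sample sheet.
    if (!sentence || !source || !rationale) {
      throw new Error(`Data row ${i + 1}: missing a mandatory column`)
    }

    const row: Row = { sentence, source, rationale }

    // The fourth column is optional, but if present it must be O or X.
    if (qa) {
      if (!isQa(qa)) {
        throw new Error(`Data row ${i + 1}: QA feedback must be O or X`)
      }
      row.qa = qa
    }

    return row
  })
}
```

Rejecting unknown QA values up front, rather than silently dropping them, is one possible design choice; a real importer might prefer to collect all row errors before reporting.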

10 changes: 8 additions & 2 deletions server/src/infrastructure/storage/storage.ts
@@ -3,10 +3,13 @@ import { taskEither as TE } from 'fp-ts'
import { getConfig } from '../../config-helper'
import { Readable } from 'stream'
import { StatusCodes } from 'http-status-codes'
import { Metadata } from '@google-cloud/storage/build/src/nodejs-common'

const TWELVE_HOURS_IN_MS = 1000 * 60 * 60 * 12

export type Metadata = {
size: number
}

const storage =
getConfig().ENVIRONMENT === 'local'
? new Storage({
@@ -124,7 +127,10 @@ const getMetadata =
return TE.tryCatch(
async () => {
const [metadata] = await storage.bucket(bucket).file(path).getMetadata()
return metadata

return {
size: Number(metadata.size)
}
},
(err: Error) => err
)
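
The hunk above swaps the SDK's `Metadata` import for a local type that exposes only `size`, converted with `Number(...)`. A minimal sketch of that narrowing step in isolation, assuming the raw metadata may report `size` as a string or a number; `RawMetadata` and `toMetadata` are hypothetical names used only for this illustration:

```typescript
// Hypothetical stand-in for the object returned by the storage SDK;
// only the field used here is modelled.
type RawMetadata = { size?: string | number }

// Local type mirroring the one introduced in storage.ts.
type Metadata = { size: number }

// Convert the raw value to a number once, at the module boundary.
// Number(undefined) is NaN, so callers may want to check for that case.
const toMetadata = (raw: RawMetadata): Metadata => ({
  size: Number(raw.size),
})

console.log(toMetadata({ size: '1024' }))
```

Keeping the exported type this small means callers such as `bucket.ts` no longer depend on an internal build path of the storage package.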
42 changes: 16 additions & 26 deletions server/src/lib/bucket.ts
@@ -6,6 +6,7 @@ import { ClientClip } from './takeout'
import * as Sentry from '@sentry/node'
import { pipe } from 'fp-ts/lib/function'
import {
Metadata,
deleteFileFromBucket,
doesFileExistInBucket,
downloadFileFromBucket,
@@ -16,7 +17,6 @@ import {
uploadToBucket,
} from '../infrastructure/storage/storage'
import { task as T, taskEither as TE } from 'fp-ts'
import { Metadata } from '@google-cloud/storage/build/src/nodejs-common'
import * as archiver from 'archiver'
import { zip } from 'fp-ts/lib/ReadonlyArray'

@@ -50,7 +50,6 @@ export default class Bucket {
TE.getOrElse(() => T.of(`Cannot get signed url for ${key}`))
)()


return url
}

@@ -168,33 +167,24 @@ export default class Bucket {

const bucket = getConfig().CLIP_BUCKET_NAME
const passThrough = new PassThrough()

const downloadList = paths.map(path => downloadFileFromBucket(bucket)(path))

const buffers = await pipe(
downloadList,
TE.sequenceArray,
TE.match(
e => {
console.log(e)
return [] as Buffer[]
},
buffers => buffers
)
)()

const bufferList = zip(buffers)(paths)

const archive = archiver('zip', { zlib: { level: 6 } })

archive.pipe(passThrough)

bufferList.forEach(([path, buffer]) =>
archive.append(buffer, {
name: `takeout_${takeout.id}_pt_${chunkIndex}/${
path.split('/').length > 1 ? path.split('/')[1] : path
}`,
})
)
for (const path of paths) {
await pipe(
path,
downloadFileFromBucket(bucket),
TE.map(buffer => ({ path, buffer })),
TE.map(clip =>
archive.append(clip.buffer, {
name: `takeout_${takeout.id}_pt_${chunkIndex}/${
path.split('/').length > 1 ? path.split('/')[1] : path
}`,
})
)
)()
}

archive.finalize()
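
The hunk above replaces "download every file, then append them all" with a loop that downloads and appends one file per iteration. The sketch below shows that pattern with plain promises instead of fp-ts; `Download`, `Append`, `entryName`, and `appendSequentially` are hypothetical names for this illustration, and `Uint8Array` stands in for the buffers used in the commit. Unlike the `TaskEither` pipeline, a rejected download here propagates to the caller.

```typescript
// Hypothetical downloader: resolves to the file contents for a path.
type Download = (path: string) => Promise<Uint8Array>

// Hypothetical sink standing in for archive.append.
type Append = (data: Uint8Array, name: string) => void

// Use the second path segment as the entry name when there is one,
// mirroring the expression in the diff.
const entryName = (path: string): string => {
  const parts = path.split('/')
  return parts.length > 1 ? parts[1] : path
}

// Download and append one file at a time, so only one download is
// in flight per iteration instead of collecting every buffer first.
async function appendSequentially(
  paths: string[],
  download: Download,
  append: Append
): Promise<void> {
  for (const path of paths) {
    const data = await download(path)
    append(data, entryName(path))
  }
}
```

The trade-off is throughput for memory: downloads no longer overlap, but the process does not need to hold every clip of a chunk at once before archiving starts.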

23 changes: 9 additions & 14 deletions server/src/lib/takeout.ts
@@ -173,12 +173,12 @@ export default class Takeout {
// .map((keys, offset) => bucket.zipTakeoutFilesToS3(takeout, offset, keys)));
const fileSizes: {size: number}[] = [];

chunkedClips.forEach(async (chunk: string[], index: number) => {
for (const [index, chunk] of chunkedClips.entries()) {
fileSizes.push(
await this.bucket.zipTakeoutFilesToS3(takeout, index, chunk)
);
await job.progress(Math.ceil((100 * index) / chunkedClips.length));
});
}

fileSizes.push(await this.bucket.uploadClipMetadata(takeout, clips));
const totalSize = fileSizes.reduce(
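
The switch from `forEach(async ...)` to `for...of` in this hunk matters because `Array.prototype.forEach` does not wait for promises returned by its callback, so code after the loop can run before any chunk has been processed. A minimal sketch of the difference; `sizeOf`, `collectWithForEach`, and `collectWithForOf` are hypothetical names, and `sizeOf` merely stands in for zipping one chunk:

```typescript
// Hypothetical async step standing in for zipping one chunk.
const sizeOf = async (chunk: string[]): Promise<{ size: number }> => ({
  size: chunk.length,
})

// forEach ignores the promises returned by an async callback, so the
// array is still empty when the length is read.
async function collectWithForEach(chunks: string[][]): Promise<number> {
  const sizes: { size: number }[] = []
  chunks.forEach(async chunk => {
    sizes.push(await sizeOf(chunk))
  })
  return sizes.length
}

// for...of awaits each step before moving on to the next chunk.
async function collectWithForOf(chunks: string[][]): Promise<number> {
  const sizes: { size: number }[] = []
  for (const chunk of chunks) {
    sizes.push(await sizeOf(chunk))
  }
  return sizes.length
}
```

With two chunks, `collectWithForEach` resolves to 0 and `collectWithForOf` resolves to 2, which suggests why a total computed right after the old loop could miss the chunk sizes.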
@@ -370,17 +370,12 @@ export default class Takeout {
}

private static splitIntoChunks(paths: string[]): string[][] {
const todo = [...paths];
const chunks = [];

do {
const currentChunk = [];
while (todo.length && currentChunk.length < kChunkMaxFiles) {
const clip = todo.pop();
currentChunk.push(clip);
}
chunks.push(currentChunk);
} while (todo.length);
return chunks;
var result = [];

for (let i = 0; i < paths.length; i += kChunkMaxFiles) {
result.push(paths.slice(i, i + kChunkMaxFiles));
}

return result;
}
}
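
The rewritten `splitIntoChunks` uses fixed-size slices. A standalone sketch of the same technique, with a hypothetical `maxFiles` value in place of `kChunkMaxFiles` (which is defined elsewhere in `takeout.ts`) and a generic signature added for the example:

```typescript
// Hypothetical limit for this sketch only.
const maxFiles = 2

// Split items into consecutive chunks of at most `size` elements.
// `size` must be greater than zero, otherwise the loop never advances.
function splitIntoChunks<T>(items: T[], size: number): T[][] {
  const result: T[][] = []

  for (let i = 0; i < items.length; i += size) {
    result.push(items.slice(i, i + size))
  }

  return result
}

console.log(splitIntoChunks(['a', 'b', 'c', 'd', 'e'], maxFiles))
// → [['a', 'b'], ['c', 'd'], ['e']]
console.log(splitIntoChunks<string>([], maxFiles))
// → []
```

Reading the removed lines, the earlier `pop()`-based loop consumed paths from the end of the array and produced a single empty chunk for empty input; the slice-based version keeps the original order and returns no chunks in that case.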
2 changes: 2 additions & 0 deletions web/locales/am/messages.ftl
@@ -1806,6 +1806,8 @@ sc-redirect-page-subtitle-2 = ጥያቄዎችን በ<matrixLink>ማትሪክስ<
sc-bulk-upload-header = <icon></icon> የሕዝብ ጎራ ዓረፍተ-ነገሮችን ይጫኑ
sc-bulk-upload-instruction = ፋይልዎን ወደዚህ ይጎትቱት ወይም <uploadButton>ለመጫን ጠቅ ያድርጉ</uploadButton>
sc-bulk-upload-instruction-drop = ለመጫን ፋይልዎን እዚህ ጣል ያድርጉ
bulk-upload-additional-information = ስለዚህ ፋይል ማቅረብ የሚፈልጉት ተጨማሪ መረጃ ካለ፣ እባክዎን <emailFragment>commonvoice@mozilla.com</emailFragment>ን ያግኙ
template-file-additional-information = ስለዚህ ፋይል በአብነት ውስጥ ያልተካተተ ተጨማሪ መረጃ ካለ፣ እባክዎን <emailFragment>commonvoice@mozilla.com</emailFragment>ን ያግኙ
try-upload-again = ፋይልዎን ወደዚህ በመጎተት እንደገና ይሞክሩ
try-upload-again-md = እንደገና ለመጫን ይሞክሩ
select-file = ፋይል ይምረጡ
3 changes: 3 additions & 0 deletions web/locales/ca/messages.ftl
@@ -744,6 +744,7 @@ number-of-voices = Nombre de veus
splits = Divisions
email-to-download = Introduïu l'adreça electrònica per baixar
why-email = <b>Per què una adreça electrònica?</b> És una forma de contacte en cas que ens haguéssim de posar en contacte en un futur per canvis en el conjunt de dades.
why-donate = Per què ho demaneu?
confirm-size = Estic preparat per a iniciar una baixada de <b>{ $size }</b>
size-gigabyte = GB
size-megabyte = MB
@@ -1742,6 +1743,8 @@ sc-redirect-page-subtitle-2 = Feu-nos preguntes a <matrixLink>Matrix</matrixLink
sc-bulk-upload-header = Pugeu <icon></icon> frases de domini públic
sc-bulk-upload-instruction = Arrossegueu el fitxer aquí o <uploadButton>feu clic per a pujar-lo</uploadButton>
sc-bulk-upload-instruction-drop = Deixeu anar el fitxer aquí per a pujar-lo
bulk-upload-additional-information = Si hi ha informació addicional que voleu proporcionar sobre aquest fitxer, poseu-vos en contacte amb <emailFragment>commonvoice@mozilla.com</emailFragment>
template-file-additional-information = Si hi ha informació addicional que voleu proporcionar sobre aquest fitxer que no s'inclou en la plantilla, poseu-vos en contacte amb <emailFragment>commonvoice@mozilla.com</emailFragment>
try-upload-again = Torneu-ho a provar arrossegant el fitxer aquí
try-upload-again-md = Proveu de pujar-lo de nou
select-file = Seleccioneu el fitxer
2 changes: 2 additions & 0 deletions web/locales/cs/messages.ftl
@@ -1795,6 +1795,8 @@ sc-redirect-page-subtitle-2 = Ptejte se na <matrixLink>Matrixu</matrixLink>, <di
sc-bulk-upload-header = Nahrát <icon></icon> věty jako volné dílo
sc-bulk-upload-instruction = Přetáhněte sem soubor nebo <uploadButton>klepněte pro nahrání</uploadButton>
sc-bulk-upload-instruction-drop = Sem přetáhněte soubor na nahrání
bulk-upload-additional-information = Pokud chcete k tomuto souboru poskytnout další informace, kontaktujte nás prosím na adrese <emailFragment>commonvoice@mozilla.com</emailFragment>.
template-file-additional-information = Pokud chcete o tomto souboru poskytnout další informace, které nejsou obsaženy v šabloně, kontaktujte nás prosím na adrese <emailFragment>commonvoice@mozilla.com</emailFragment>.
try-upload-again = Zkuste to znovu přesunutím souboru sem
try-upload-again-md = Zkuste nahrát znovu
select-file = Vybrat soubor
5 changes: 4 additions & 1 deletion web/locales/cy/messages.ftl
@@ -234,7 +234,7 @@ speak-now = Siaradwch nawr
datasets = Setiau data
languages = Ieithoedd
about = Amdanom Ni
partner = Partner
partner = Partneru
profile = Proffil
help = Cymorth
contact = Cysylltu
@@ -764,6 +764,7 @@ number-of-voices = Nifer y Lleisiau
splits = Rhannu
email-to-download = Rhowch E-bost i'w Lwytho i Lawr
why-email = <b> Pam e-bost? </ b> Efallai y bydd angen i chi gysylltu â chi yn y dyfodol ynghylch newidiadau i'r set ddata, mae e-bost yn rhoi pwynt cyswllt inni.
why-donate = Pam ydych chi'n gofyn?
confirm-size = Rydych yn barod i gychwyn llwytho i lawr <b>{ $size }</b>
size-gigabyte = GB
size-megabyte = MB
@@ -1846,6 +1847,8 @@ sc-redirect-page-subtitle-2 = Gofynnwch gwestiynau i ni ar <matrixLink>Matrics</
sc-bulk-upload-header = Llwytho i fyny <icon></icon> brawddegau parth cyhoeddus
sc-bulk-upload-instruction = Llusgwch eich ffeil yma neu <uploadButton>cliciwch i'w llwytho i fyny</uploadButton>
sc-bulk-upload-instruction-drop = Gollwng ffeil yma i'w llwytho i fyny
bulk-upload-additional-information = Os oes unrhyw wybodaeth ychwanegol yr hoffech ei darparu am y ffeil hon, cysylltwch â <emailFragment>commonvoice@mozilla.com</emailFragment>
template-file-additional-information = Os oes unrhyw wybodaeth ychwanegol yr hoffech ei darparu am y ffeil hon nad yw wedi'i chynnwys yn y templed, cysylltwch â <emailFragment>commonvoice@mozilla.com</emailFragment>
try-upload-again = Ceisiwch eto trwy lusgo'ch ffeil yma
try-upload-again-md = Ceisiwch lwytho i fyny eto
select-file = Dewis Ffeil
2 changes: 2 additions & 0 deletions web/locales/de/messages.ftl
@@ -1816,6 +1816,8 @@ sc-redirect-page-subtitle-2 = Stellen Sie uns Fragen auf <matrixLink>Matrix</mat
sc-bulk-upload-header = Laden Sie <icon></icon> gemeinfreie Sätze hoch
sc-bulk-upload-instruction = Ziehen Sie Ihre Datei hierher oder <uploadButton>klicken Sie zum Hochladen</uploadButton>
sc-bulk-upload-instruction-drop = Datei zum Hochladen hier ablegen
bulk-upload-additional-information = Wenn Sie weitere Informationen zu dieser Datei angeben möchten, wenden Sie sich bitte an <emailFragment>commonvoice@mozilla.com</emailFragment>
template-file-additional-information = Wenn Sie zusätzliche Informationen zu dieser Datei angeben möchten, die nicht in der Vorlage enthalten sind, wenden Sie sich bitte an <emailFragment>commonvoice@mozilla.com</emailFragment>
try-upload-again = Versuchen Sie es erneut, indem Sie Ihre Datei hierher ziehen
try-upload-again-md = Hochladen erneut versuchen
select-file = Datei auswählen
2 changes: 2 additions & 0 deletions web/locales/el/messages.ftl
@@ -1747,6 +1747,8 @@ sc-redirect-page-subtitle-2 = Κάντε μας ερωτήσεις στο <matri
sc-bulk-upload-header = <icon></icon> Μεταφόρτωση προτάσεων δημόσιου τομέα
sc-bulk-upload-instruction = Σύρετε το αρχείο σας εδώ ή <uploadButton>κάντε κλικ για μεταφόρτωση</uploadButton>
sc-bulk-upload-instruction-drop = Σύρετε το αρχείο εδώ για μεταφόρτωση
bulk-upload-additional-information = Αν θέλετε να υποβάλετε επιπρόσθετες πληροφορίες σχετικά με αυτό το αρχείο, παρακαλούμε επικοινωνήστε με το <emailFragment>commonvoice@mozilla.com</emailFragment>
template-file-additional-information = Αν υπάρχουν επιπρόσθετες πληροφορίες που θέλετε να υποβάλετε σχετικά με αυτό το αρχείο που δεν περιλαμβάνονται στο πρότυπο, παρακαλούμε επικοινωνήστε με το <emailFragment>commonvoice@mozilla.com</emailFragment>
try-upload-again = Δοκιμάστε ξανά σύροντας το αρχείο σας εδώ
try-upload-again-md = Δοκιμάστε να μεταφορτώσετε ξανά
select-file = Επιλογή αρχείου
1 change: 1 addition & 0 deletions web/locales/en/messages.ftl
@@ -765,6 +765,7 @@ validated-hr-total = Validated Hr. Total
overall-hr-total = Overall Hr. Total
cv-license = License
audio-format = Audio Format
dataset-splits = Splits (Age and Sex)
number-of-voices = Number of Voices
splits = Splits
email-to-download = Enter Email to Download