Skip to content

Deep Learning Based Visual Table Recogition, Whisper for Multilingual Speech Recognition with 85+ models across various languages and 40+ new models in John Snow Labs NLU 5.1.1

Compare
Choose a tag to compare
@C-K-Loan C-K-Loan released this 08 Jan 02:37
· 1 commit to release/511 since this release

We are incredibly excited to announce NLU 5.1.1 has been released with over 130+ models in 36+ languages including new models based on Whisper for multilingual automatic speech recognition and Deep Learning based Visual Table Recogition using cascade R-CNN

You can now transcribe speech to text with Whispe with 85+ models across 36 languages for Automatic Speech Recognition (ASR).
Additionally, Deep Learning based Visual Table Recogition based on an Cascade mask R-CNN HRNet that features detection of tables within images is now available in NLU 🌟.

Finally, 40+ new models for existing model classes has been added!

Deep Learning based Visual Table Recogition

Cascade R-CNN
Tutorial Notebook

You can now extract tables from images as pandas dataframe in 1 line of code, leveraging Spark OCR's ImageTableDetector, ImageTableCellDetector and ImageCellsToTextTable classes.

The ImageTableDetector is a deep-learning model designed to identify tables within images. It utilizes the CascadeTabNet architecture, which incorporates the Cascade mask Region-based Convolutional Neural Network High-Resolution Network (Cascade mask R-CNN HRNet).

The ImageTableCellDetector, on the other hand, is engineered to pinpoint cells within a table image. Its foundation is an image processing algorithm that identifies both horizontal and vertical lines.

The ImageCellsToTextTable applies Optical Character Recognition (OCR) to regions of cells within an image and returns the recognized text to the outputCol as a TableContainer structure.

It’s important to note that these annotators do not need to be invoked individually in NLU. Instead, you can simply load the image_table_cell2text_table model using the command nlp.load('image_table_cell2text_table'), and then use nlp.predict to make predictions on any document.

Powered by Spark OCR's ImageTableDetector, ImageTableCellDetector, ImageCellsToTextTable
Reference: Cascade R-CNN: High Quality Object Detection and Instance Segmentation

language nlu.load() reference Spark NLP Model Reference
en en.image_table_detector General Model for Table Detection

Whisper for CTC

Whisper

Tutorial Notebook

Whisper Model with a language modeling head on top for Connectionist Temporal Classification (CTC). Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. It transcribe in multiple languages, as well as translate from those languages into English. Whisper was trained and open-sourced that approaches human level robustness and accuracy on English speech recognition.

Powered by Spark-NLP's WhisperForCTC Annotator
Reference: OpenAI Whisper: Robust Speech Recognition via Large-Scale Weak Supervision
Note that at the moment only Spark Versions 3.4 and up are supported.

Language NLU Reference Spark NLP Reference Annotator Class
bg bg.speech2text.whisper.tiny_bulgarian_l asr_whisper_tiny_bulgarian_l WhisperForCTC
cs cs.speech2text.whisper.small_czech_cv11 asr_whisper_small_czech_cv11 WhisperForCTC
da da.speech2text.whisper.danish_small_augmented asr_whisper_danish_small_augmented WhisperForCTC
de de.speech2text.whisper.small_allsnr asr_whisper_small_allsnr WhisperForCTC
el el.speech2text.whisper.samoan_farsipal_e5 asr_whisper_samoan_farsipal_e5 WhisperForCTC
el el.speech2text.whisper.small_greek asr_whisper_small_greek WhisperForCTC
el el.speech2text.whisper.tswana_greek_modern_e1 asr_whisper_tswana_greek_modern_e1 WhisperForCTC
en en.speech2text.whisper.small_english_model asr_personal_whisper_small_english_model WhisperForCTC
en en.speech2text.whisper.base_bulgarian_l asr_whisper_base_bulgarian_l WhisperForCTC
en en.speech2text.whisper.base_english asr_whisper_base_english WhisperForCTC
en en.speech2text.whisper.base_european asr_whisper_base_european WhisperForCTC
en en.speech2text.whisper.base_swedish asr_whisper_base_swedish WhisperForCTC
en en.speech2text.whisper.persian_farsi asr_whisper_persian_farsi WhisperForCTC
en en.speech2text.whisper.small_arabic_cv11 asr_whisper_small_arabic_cv11 WhisperForCTC
en en.speech2text.whisper.small_bak asr_whisper_small_bak WhisperForCTC
en en.speech2text.whisper.small_bengali_subhadeep asr_whisper_small_bengali_subhadeep WhisperForCTC
en en.speech2text.whisper.small_chinese_hk asr_whisper_small_chinese_hk WhisperForCTC
en en.speech2text.whisper.small_chinese_tw_voidful asr_whisper_small_chinese_tw_voidful WhisperForCTC
en en.speech2text.whisper.small_english asr_whisper_small_english WhisperForCTC
en en.speech2text.whisper.small_english_blueraccoon asr_whisper_small_english_blueraccoon WhisperForCTC
en en.speech2text.whisper.small_german asr_whisper_small_german WhisperForCTC
en en.speech2text.whisper.small_hungarian_cv11 asr_whisper_small_hungarian_cv11 WhisperForCTC
en en.speech2text.whisper.small_lithuanian_serbian_v2 asr_whisper_small_lithuanian_serbian_v2 WhisperForCTC
en en.speech2text.whisper.small_mongolian_2 asr_whisper_small_mongolian_2 WhisperForCTC
en en.speech2text.whisper.small_mongolian_3 asr_whisper_small_mongolian_3 WhisperForCTC
en en.speech2text.whisper.small_portuguese_yapeng asr_whisper_small_portuguese_yapeng WhisperForCTC
en en.speech2text.whisper.small_se2 asr_whisper_small_se2 WhisperForCTC
en en.speech2text.whisper.small_spanish_1e_6 asr_whisper_small_spanish_1e_6 WhisperForCTC
en en.speech2text.whisper.small_swe2 asr_whisper_small_swe2 WhisperForCTC
en en.speech2text.whisper.small_swe_davidt123 asr_whisper_small_swe_davidt123 WhisperForCTC
en en.speech2text.whisper.small_swedish_se_afroanton asr_whisper_small_swedish_se_afroanton WhisperForCTC
en en.speech2text.whisper.small_telugu_openslr asr_whisper_small_telugu_openslr WhisperForCTC
en en.speech2text.whisper.small_tonga_zambia asr_whisper_small_tonga_zambia WhisperForCTC
en en.speech2text.whisper.small_urdu_1000_64_1e_05_pretrain_arabic asr_whisper_small_urdu_1000_64_1e_05_pretrain_arabic WhisperForCTC
en en.speech2text.whisper.testrun1 asr_whisper_testrun1 WhisperForCTC
en en.speech2text.whisper.tiny_english asr_whisper_tiny_english WhisperForCTC
en en.speech2text.whisper.tiny_european asr_whisper_tiny_european WhisperForCTC
en en.speech2text.whisper.tiny_german asr_whisper_tiny_german WhisperForCTC
en en.speech2text.whisper.tiny_italian_local asr_whisper_tiny_italian_local WhisperForCTC
en en.speech2text.whisper.tiny_pashto asr_whisper_tiny_pashto WhisperForCTC
en en.speech2text.whisper.tiny_tgl asr_whisper_tiny_tgl WhisperForCTC
es es.speech2text.whisper.small_spanish_ari asr_whisper_small_spanish_ari WhisperForCTC
es es.speech2text.whisper.tiny_spanish_arpagon asr_whisper_tiny_spanish_arpagon WhisperForCTC
fi fi.speech2text.whisper.small_finnish_15k_samples asr_whisper_small_finnish_15k_samples WhisperForCTC
fi fi.speech2text.whisper.small_finnish_sgangireddy asr_whisper_small_finnish_sgangireddy WhisperForCTC
fr fr.speech2text.whisper.small_defined_dot_ai_qc_french_combined_dataset_normalized asr_whisper_small_defined_dot_ai_qc_french_combined_dataset_normalized WhisperForCTC
hi hi.speech2text.whisper.small_french_yocel1 asr_whisper_small_french_yocel1 WhisperForCTC
hi hi.speech2text.whisper.small_hindi_norwegian_tensorboard asr_whisper_small_hindi_norwegian_tensorboard WhisperForCTC
hi hi.speech2text.whisper.small_hindi_shripadbhat asr_whisper_small_hindi_shripadbhat WhisperForCTC
hi hi.speech2text.whisper.small_hindi_xinhuang asr_whisper_small_hindi_xinhuang WhisperForCTC
hy hy.speech2text.whisper.small_armenian asr_whisper_small_armenian WhisperForCTC
it it.speech2text.whisper.small_italian_3 asr_whisper_small_italian_3 WhisperForCTC
it it.speech2text.whisper.tiny_italian_1 asr_whisper_tiny_italian_1 WhisperForCTC
it it.speech2text.whisper.tiny_italian_2 asr_whisper_tiny_italian_2 WhisperForCTC
ja ja.speech2text.whisper.small_japanese_jakeyoo asr_whisper_small_japanese_jakeyoo WhisperForCTC
ja ja.speech2text.whisper.small_japanese_vumichien asr_whisper_small_japanese_vumichien WhisperForCTC
kn kn.speech2text.whisper.kannada_base asr_whisper_kannada_base WhisperForCTC
kn kn.speech2text.whisper.kannada_small asr_whisper_kannada_small WhisperForCTC
ko ko.speech2text.whisper.small_korean_fl asr_whisper_small_korean_fl WhisperForCTC
lt lt.speech2text.whisper.lithuanian_finetune asr_whisper_lithuanian_finetune WhisperForCTC
lt lt.speech2text.whisper.small_lithuanian_deividasm asr_whisper_small_lithuanian_deividasm WhisperForCTC
ml ml.speech2text.whisper.malayalam_first_model asr_whisper_malayalam_first_model WhisperForCTC
mn mn.speech2text.whisper.small_mongolian_1 asr_whisper_small_mongolian_1 WhisperForCTC
ne ne.speech2text.whisper.small_nepali_np asr_whisper_small_nepali_np WhisperForCTC
nl nl.speech2text.whisper.small_dutch asr_whisper_small_dutch WhisperForCTC
no no.speech2text.whisper.small_nob asr_whisper_small_nob WhisperForCTC
pa pa.speech2text.whisper.small_punjabi_eastern asr_whisper_small_punjabi_eastern WhisperForCTC
pl pl.speech2text.whisper.small_polish_aspik101 asr_whisper_small_polish_aspik101 WhisperForCTC
pl pl.speech2text.whisper.tiny_polish asr_whisper_tiny_polish WhisperForCTC
ps ps.speech2text.whisper.small_pashto_ihanif asr_whisper_small_pashto_ihanif WhisperForCTC
sv sv.speech2text.whisper.small_swedish_english asr_whisper_small_swedish_english WhisperForCTC
sv sv.speech2text.whisper.small_swedish_test_3000 asr_whisper_small_swedish_test_3000 WhisperForCTC
sv sv.speech2text.whisper.small_swedish_torileatherman asr_whisper_small_swedish_torileatherman WhisperForCTC
sw sw.speech2text.whisper.small_swahili_pplantinga asr_whisper_small_swahili_pplantinga WhisperForCTC
ta ta.speech2text.whisper.tiny_tamil_example asr_whisper_tiny_tamil_example WhisperForCTC
te te.speech2text.whisper.small_telugu_146h asr_whisper_small_telugu_146h WhisperForCTC
te te.speech2text.whisper.telugu_tiny asr_whisper_telugu_tiny WhisperForCTC
th th.speech2text.whisper.small_thai_napatswift asr_whisper_small_thai_napatswift WhisperForCTC
tt tt.speech2text.whisper.small_tatar asr_whisper_small_tatar WhisperForCTC
uk uk.speech2text.whisper.small_ukrainian asr_whisper_small_ukrainian WhisperForCTC
uz uz.speech2text.whisper.small_uzbek asr_whisper_small_uzbek WhisperForCTC
vi vi.speech2text.whisper.small_vietnamese_tuananh7198 asr_whisper_small_vietnamese_tuananh7198 WhisperForCTC
xx xx.speech2text.whisper.base asr_whisper_base WhisperForCTC
xx xx.speech2text.whisper.base_bengali_trans asr_whisper_base_bengali_trans WhisperForCTC
xx xx.speech2text.whisper.small asr_whisper_small WhisperForCTC
xx xx.speech2text.whisper.tiny asr_whisper_tiny WhisperForCTC
xx xx.speech2text.whisper.tiny_opt asr_whisper_tiny_opt WhisperForCTC
zh zh.speech2text.whisper.small_chinese asr_whisper_small_chinese WhisperForCTC
zh zh.speech2text.whisper.small_chinesebasetw asr_whisper_small_chinesebasetw WhisperForCTC

New Models

Language NLU Reference Spark NLP Reference Annotator Class
en en.assert.sdoh_wip assertion_sdoh_wip AssertionDLModel
en en.assert.vop_clinical_large assertion_vop_clinical_large AssertionDLModel
ar ar.answer_question.bert_qa bert_qa_arap BertForQuestionAnswering
ar ar.ner.bert.by_boda bert_token_classifier_aner BertForTokenClassification
zh zh.ner.bert.base.multi_by_ckiplab bert_token_classifier_base_chinese_ner BertForTokenClassification
tr tr.ner.bert.cased.by_akdeniz27 bert_token_classifier_base_turkish_cased_ner BertForTokenClassification
hu hu.ner.bert.by_nytk bert_token_classifier_named_entity_recognition_nerkor_hu_hungarian BertForTokenClassification
xx xx.answer_question.distil_bert.cased_squadv2 distilbert_qa_base_cased_squadv2 DistilBertForQuestionAnswering
en en.classify.bert_sequence.vop_drug_side_effect bert_sequence_classifier_vop_drug_side_effect MedicalBertForSequenceClassification
en en.classify.bert_sequence.vop_hcp_consult bert_sequence_classifier_vop_hcp_consult MedicalBertForSequenceClassification
en en.classify.bert_sequence.vop_self_report bert_sequence_classifier_vop_self_report MedicalBertForSequenceClassification
en en.classify.bert_sequence.vop_side_effect bert_sequence_classifier_vop_side_effect MedicalBertForSequenceClassification
en en.classify.bert_sequence.vop_sound_medical bert_sequence_classifier_vop_sound_medical MedicalBertForSequenceClassification
en en.med_ner.sdoh ner_sdoh MedicalNerModel
en en.med_ner.sdoh_access_to_healthcare ner_sdoh_access_to_healthcare MedicalNerModel
en en.med_ner.sdoh_community_condition ner_sdoh_community_condition MedicalNerModel
en en.med_ner.sdoh_demographics ner_sdoh_demographics MedicalNerModel
en en.med_ner.sdoh_health_behaviours_problems ner_sdoh_health_behaviours_problems MedicalNerModel
en en.med_ner.sdoh_income_social_status ner_sdoh_income_social_status MedicalNerModel
en en.med_ner.sdoh_social_environment ner_sdoh_social_environment MedicalNerModel
en en.med_ner.sdoh_substance_usage ner_sdoh_substance_usage MedicalNerModel
en en.med_ner.vop ner_vop MedicalNerModel
en en.med_ner.vop_emb_clinical_large ner_vop_emb_clinical_large MedicalNerModel
en en.summarize.clinical_laymen_onnx summarizer_clinical_laymen_onnx MedicalSummarizer
en en.classify.hoc multiclassifierdl_hoc MultiClassifierDLModel
en en.med_ner.chemd_clinical.pipeline ner_chemd_clinical_pipeline PipelineModel
en en.med_ner.jsl_langtest.pipeline ner_jsl_langtest_pipeline PipelineModel
en en.med_ner.living_species.pipeline ner_living_species_pipeline PipelineModel
en en.med_ner.oncology_posology_langtest.pipeline ner_oncology_posology_langtest_pipeline PipelineModel
en en.med_ner.profiling_oncology ner_profiling_oncology PipelineModel
en en.med_ner.profiling_sdoh ner_profiling_sdoh PipelineModel
en en.med_ner.profiling_vop ner_profiling_vop PipelineModel
en en.med_ner.sdoh_langtest.pipeline ner_sdoh_langtest_pipeline PipelineModel
en en.med_ner.supplement_clinical.pipeline ner_supplement_clinical_pipeline PipelineModel
en en.summarize.clinical_guidelines_large.pipeline summarizer_clinical_guidelines_large_pipeline PipelineModel
en en.summarize.clinical_jsl_augmented.pipeline summarizer_clinical_jsl_augmented_pipeline PipelineModel
en en.summarize.clinical_questions.pipeline summarizer_clinical_questions_pipeline PipelineModel
en en.summarize.generic_jsl.pipeline summarizer_generic_jsl_pipeline PipelineModel
en en.summarize.radiology.pipeline summarizer_radiology_pipeline PipelineModel
en en.embed.glove.clinical_large embeddings_clinical_large WordEmbeddingsModel

Minor Features

  • New feature has been incorporated into the Light Pipeline. This feature allows users to enable or disable the usage of the Light Pipeline by setting pipe.is_light_pipe_incompatible=True.

📖 Additional NLU resources


Installation

#PyPI
pip install nlu pyspark